Are the CSV downloads malformed when a comma appears in a URL?
-
Howdy folks, we've been a PRO member for about 24 hours now and I have to say we're loving it! One problem I am having, however, is with a CSV that I exported from our crawl diagnostics summary.
The CSV contains all the data, but I run into trouble when a URL contains a comma. I am making a little tool to work with the CSVs we download, and I can't parse them properly because URLs sometimes contain commas and aren't quoted the way other fields, such as meta_description_tag, are.
Is there something simple I'm missing or is it something that can be fixed?
Looking forward to learning more about the various tools. Thanks for the help.
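For anyone hitting the same thing, a quick way to see which rows are affected is to compare each row's field count with the header's. This is only a minimal sketch: the column names and sample rows below are made up for illustration and are not the real export layout.

```python
import csv
import io

# Hypothetical sample: three columns, and the last row has an unquoted
# URL containing a comma in its final field.
SAMPLE = (
    'title,meta_description_tag,url\n'
    'Page A,"A description, with a comma",http://example.com/a\n'
    'Page B,"Another description",http://example.com/b,c\n'
)

def find_suspect_rows(text):
    """Return (line_number, field_count) for rows whose field count differs from the header's."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    suspects = []
    for row in reader:
        if len(row) != len(header):
            suspects.append((reader.line_num, len(row)))
    return suspects

suspects = find_suspect_rows(SAMPLE)
print(suspects)
```

Rows flagged this way are the ones where an unquoted comma has split a field into extra columns.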
-
I won't be too hard on the programmers - I'm a programmer myself. Our small business has developers and designers doing the bulk of the SEO. I can see you've looked into it as I have - there are many factors involved if I were to decide to "fix" this myself. To be honest, I don't fancy it - I'm hoping the better approach will come from the wonderful SEOmoz developers, who might put in a fix. Hint hint.
-
The first rule in this business is "You can't trust programmers"
I should know, I am a programmer and I used to manage teams of them.
You can't trust them to write something perfect, because they will always make huge assumptions, based on what they know.
They should know that URLs can contain commas, and they should quote them.
If they didn't do that in the final field, it is a deficiency in the code and your stuff isn't going to work unless you fix it manually.
What you need to do to fix this is to add a quote after the 10th comma and also add one at the end of each line.
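As a rough sketch of that manual fix, assuming the unquoted URL is always the last field and that none of the earlier fields contain commas or line breaks, something like the following would do it. The function name `quote_tail` and the `comma_index` parameter are my own, not part of any export tool.

```python
def quote_tail(line, comma_index=10):
    """Wrap everything after the comma_index-th comma in double quotes.

    Naive: commas are counted without regard to quoting, so this only
    works when the fields before the tail contain no commas themselves.
    """
    stripped = line.rstrip("\r\n")
    parts = stripped.split(",", comma_index)
    if len(parts) <= comma_index:
        return stripped  # fewer commas than expected; leave the line alone
    tail = parts[comma_index]
    if len(tail) >= 2 and tail.startswith('"') and tail.endswith('"'):
        return stripped  # tail already looks quoted
    tail = tail.replace('"', '""')  # double any embedded quotes
    return ",".join(parts[:comma_index]) + ',"' + tail + '"'

fixed = quote_tail("a,b,c,d,e,f,g,h,i,j,http://example.com/x,y")
print(fixed)
```

The repaired line can then be handed to a normal CSV parser, which should see eleven fields with the whole URL in the last one.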
Unfortunately, even that is a problem.
The problem is there are other fields that may not be quoted, some of which can start with http://
There can also be line breaks in the title field, and possibly even in the link text field.
Quotes and other characters are escaped with double quotes.
Titles and link text can also contain commas, so it is very complex.
Some of the fields are a bigger mess because it depends on the link text, and if the link text contains an image, you'll have quotes and equals signs, commas and all kinds of stuff. You can also have upper ascii characters and multibyte characters.
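For the fields that are quoted properly, a real CSV parser copes with all of this, so it is better not to split on commas by hand. Here is a small sketch using Python's standard `csv` module. The sample data is invented, and it assumes embedded quotes are escaped by doubling them.

```python
import csv
import io

# Hypothetical sample: a quoted URL with a comma, a title with a line
# break, and link text containing an image tag with doubled quotes.
SAMPLE = (
    'url,title,link_text\r\n'
    '"http://example.com/a,b","Line one\nline two",'
    '"<img alt=""logo"" src=""x.png"">, home"\r\n'
)

def parse_rows(text):
    # newline="" leaves line endings untouched so the csv module can
    # handle line breaks inside quoted fields itself.
    return list(csv.reader(io.StringIO(text, newline="")))

rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

When reading from a file instead of a string, opening it with `newline=""` and an explicit encoding such as UTF-8 (an assumption about the export) should take care of the multibyte characters as well.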
They did actually quote the first URL, if it contains commas.
They really should have quoted every field.
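For comparison, this is roughly what quoting every field looks like on the writing side. It is a sketch using Python's `csv` module, not a claim about how the export is actually generated, and the helper name `write_all_quoted` is made up.

```python
import csv
import io

def write_all_quoted(rows):
    """Serialize rows to CSV text with every field wrapped in double quotes."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buf.getvalue()

out = write_all_quoted([["http://example.com/a,b", 'He said "hi"', "3"]])
print(out)
```

With every field quoted, commas, quotes and line breaks inside a value no longer matter to whoever parses the file.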
Related Questions
-
Duplicate Page Content on pages that appear to be different?
Hi Everyone! My name's Ross, and I work at CHARGED.fm. I worked with Luke, who has asked quite a few questions here, but he has since moved on to a new adventure. So I am trying to step into his role. I am very much a beginner in SEO, so I'm trying to learn a lot of this on the fly, and bear with me if this is something simple. In our latest MOZ Crawl, over 28K high priority issues were detected, and they are all Duplicate Page Content issues. However, when looking at the issues laid out, the examples that it gives for "Duplicate URLs" under each individual issue appear to be completely different pages. They have different page titles, different descriptions, etc. Here's an example. For "LPGA Tickets", it is giving 19 Duplicate URLs. Here are a couple it lists when you expand those:
http://www.charged.fm/one-thousand-one-nights-tickets
http://www.charged.fm/trash-inferno-tickets
http://www.charged.fm/mylan-wtt-smash-hits-tickets
http://www.charged.fm/mickey-thomas-tickets
Internally, one reason we thought this might be happening is that even though the pages themselves are different, the structure is completely similar, especially if there are no events listed or if there isn't any content in the News/About sections. We are going to try and noindex pages that don't have events/new content on them as a temporary fix, but is there possibly a different underlying issue somewhere that would cause all of these duplicate page content issues to begin appearing? Any help would be greatly appreciated!
Moz Pro | | keL.A.xT.o0 -
Experts solve some query related Title Tag, Meta Tag Description, Link Building, URL
Title Tag Question Short title tag is more useful so if we just use our targeted keyword in home page title then is it useful.? for example my website: http://www.topnotchlawsuitloans.com/ i am targeting lawsuit loans keyword so if i use <title>TNF - Lawsuit Loans | Lawsuit Funding</title> is batter to use for main page or <title>Lawsuit Loans | Lawsuit Funding | As Low As 1% | PreSettlement Funding</title> can we have to use main targeting keyword on all webpage title tag ? my website have 200+ page and i have to use different title tag for that pages including targated keyword so if i am targeting lawsuit loans in that title, what is best to divide title pipe, hyphen or comma ? does capitalization in title tag wrong effect ? Lawsuit Loans - As low as 1% Lawsuit Loans | As low as 1% Lawsuit Loans, As low as 1% (or i have to use smaller cash in title) for all different page i want to place this kind of title is it best for SEO purpose Lawsuit Loans - Lawsuit Loans Fargo Lawsuit Loans - Lawsuit Loans Escondido Lawsuit Loans - Lawsuit Loans Erie Lawsuit Loans - Lawsuit Loans Flint Lawsuit Loans - Lawsuit Loans Fort Wayne Lawsuit Loans - Lawsuit Loans Fresno Lawsuit Loans - Lawsuit Loans Gainesville Lawsuit Loans - Lawsuit Loans Grand Rapids Lawsuit Loans - Lawsuit Loans Gilbert Lawsuit Loans - Lawsuit Loans Gresham Lawsuit Loans - Lawsuit Loans High Point Lawsuit Loans - Lawsuit Loans Hialeah Lawsuit Loans - Lawsuit Loans Huntsville if i am using this kind of different title for all page then it can effective for SEO or it will be come in keyword stuffing Meta Tag Description can we add meta tag description like this i mean targeted keyword before the description start, is it useful or not? 
important Meta tags please visit my http://www.topnotchlawsuitloans.com/ and inform me what are the important meta tags, so i can remove other tags Link Building Question i want to get rank in google for www.topnotchlawsuitloans.com so have to build backlinks with lawsuit loans alt tag but main question is this have to build or gain backlinks for this domain only or one of my website sub domain www.topnotchlawsuitloans.com/lawsuit-funding-philadelphia.html on page #6 so have to build backlink for this URL ??? what are the effective strategy to gain backlinks for main page or all sub pages have to build backlinks ?? how many backlink per keyword & page is good for website. URL i have to use targeted keyword on all sub page domain or not for example now i am using url like this format fundingtype.html litigation-funding.html legal-funding.html financingservices.html process.html and if i re-write all url with targated keyword like this format lawsuit-loans-fundingtype.html lawsuit-loans-litigation-funding.html lawsuit-loans-legal-funding.html lawsuit-loans-financingservices.html lawsuit-loans-process.html so which type URL are more effective for best SEO ??
Moz Pro | | JulieWhite0 -
Configure parameter effect in google wmt to reduce overly dynamic urls
We are looking at a weather forecast site with realtime information that is updated every 5 minutes. On this website many URLs have 6 parameters. The SEOmoz campaign found duplicate information and overly dynamic URLs. We then went to the URL parameters section in Google WMT and configured parameters like day, month, year (effect: none). The next weekly SEOmoz campaign showed a big reduction in duplicates and a small reduction in overly dynamic URLs. How can we reduce these 'errors' further?
Moz Pro | | theonlinefactory0 -
'Appropriate Use of Rel Canonical', Critical Factor but appears correct on page
Hi, Trying to get the following page ranked unsuccessfully.... http://www.joules.com/en-GB/2/Collections-Quilted-Jackets/c01c02.r16.1 Instead a product page is being ranked, shown below.... http://www.joules.com/en-GB/Womens-Quilted-Jacket/Navy/M_HAMPTON/ProductDetail.raction When I run the on page report card it advises that the Rel Canonical tag needs to point to that page, but we have checked and it looks to be doing that already. Has anyone else had an issue like this? Thanks, Martin
Moz Pro | | rockethot0 -
Campaign 4XX error gives duplicate page URL
I ran the report for my site and had many more 4xx errors than I've had in the past month. I updated my .htaccess to include 301 statements based on Google Webmaster Tools Crawl Errors. Google has been reporting a positive downward trend in my errors, but my SEOmoz campaign has shown a dramatic increase in the 4xx pages. Here is an example of a 4xx URL page: http://www.maximphotostudio.net/engagements/266/inniswood_park_engagements/http:%2F%2Fwww.maximphotostudio.net%2Fengagements%2F266%2Finniswood_park_engagements%2F This is strange because the URL http://www.maximphotostudio.net/engagements/266/inniswood_park_engagements/ is valid and works great, but then there is a duplicate entry with %2F representing forward slashes and 2 http statements in each link. What is the reason for this?
Moz Pro | | maximphotostudio1 -
Links in Open Site Explorer turning into downloads
Hey guys. This is my first question on here so hello 🙂 I have noticed a couple of times recently in Open Site Explorer that, when I am checking out links, they are direct download links. The two I have noticed are Flash files and, among one company's links, Dropbox-related ones. Can anyone shed any light on this? I am pretty new to SEO and find it really confusing. Thanks in advance 🙂
Moz Pro | | Nextman0 -
Problems with OSE downloads
Ordered 5 reports last 24 hours, none received. Anyone else with this problem ? I do expect better from an expensive subscription. C'mon Moz, fix this new OSE report system please.
Moz Pro | | blocker04082 -
Why is Open site Explorer showing: No Data Available for this URL
Hi there, I'm having a few problems getting my site www.incarmotorfactors.co.uk up and running on SEOmoz and I'm not sure what I'm doing wrong. Firstly, SEOmoz shows 2 links for my site, which is wrong. Google shows a lot more. However, the most noticeable problem so far is Open Site Explorer. When I type in the web address it shows "No Data Available for this URL". The site is more than a year old and has a few links. Can anybody tell me what the problem may be?
Moz Pro | | Ev840