Mapping and tracking old and new information architecture
-
Howdy.
So I'm working on "example.com", which has thousands of URLs. The site is going to be redesigned, with some changes to the information architecture.
I'm trying to think of a good way to organize and account for similarities and differences between the original information architecture and the new one. This should help with building 301s.
I've downloaded a list of URLs from example.com from Open Site Explorer. What I would love to do is generate a visual "tree" of the site based on the output from Open Site Explorer. It would basically look like a pyramid with all of the subfolders branching out.
Does anybody know of a tool out there that will do this for me? Or am I going to have a long day in Excel?
Any other thoughts on working through this process are welcome.
Thank you!
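To illustrate what I'm after: a rough Python sketch (the URLs here are made up, not from my actual export) that nests a flat URL list into a folder tree and prints it indented:

```python
from urllib.parse import urlparse

def build_tree(urls):
    """Nest URL paths into a dict-of-dicts keyed by path segment."""
    tree = {}
    for url in urls:
        path = urlparse(url).path.strip("/")
        node = tree
        for segment in path.split("/") if path else []:
            node = node.setdefault(segment, {})
    return tree

def print_tree(node, indent=0):
    """Print the tree with two-space indentation per level."""
    for name, children in sorted(node.items()):
        print("  " * indent + "/" + name)
        print_tree(children, indent + 1)

# Made-up example URLs standing in for the crawler export:
urls = [
    "http://example.com/blog/post-1",
    "http://example.com/blog/post-2",
    "http://example.com/products/widgets/",
]
print_tree(build_tree(urls))
```

Something like this could at least turn a URL export into an outline to paste alongside the new architecture.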
-
I wouldn't use OSE for this, since it may not crawl all your URLs.
I suggest using a dedicated crawler that you can set loose on the whole site. Give Xenu, Screaming Frog, or the IIS SEO Toolkit a go to get a complete picture of your URLs. Before the new site goes live, make sure you have mapped all your old URLs across to new ones.
In my experience, a site never actually looks like a tree; people just like to describe it that way because it's simple.
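To make the mapping step concrete, here's a rough sketch (Python; the URL pairs are invented, and real paths containing regex metacharacters would need escaping) that turns an old-to-new mapping into Apache RewriteRule 301s:

```python
from urllib.parse import urlparse

def redirect_rules(pairs):
    """pairs: iterable of (old_url, new_url) -> list of RewriteRule lines."""
    rules = []
    for old, new in pairs:
        old_path = urlparse(old).path.lstrip("/")
        # Anchor the match so /old-page doesn't also catch /old-page-2
        rules.append(f"RewriteRule ^{old_path}$ {new} [R=301,L]")
    return rules

# Invented example mapping; a real one would come from your crawl export:
mapping = [
    ("http://example.com/old-page", "http://example.com/new-page"),
    ("http://example.com/widgets/blue", "http://example.com/products/blue-widget"),
]
for rule in redirect_rules(mapping):
    print(rule)
```

The point is that once the old and new architectures sit side by side in a spreadsheet, generating the actual redirect rules is the easy part.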
Related Questions
-
301 Domain Redirect from old domain with HTTPS
My domain was indexed with https://www. Now that we redirected it, the certificate has been removed, and if you try to visit the old site over HTTPS it throws an obvious "this site is not secure" error and the 301 does not happen. My question is: will Google's bot have this issue?

Right now the domain has been redirecting to the new domain for a couple of months; the old site is still indexed, while the new one is not ranking well for half its terms. If that is not causing the problem, can anyone tell me why the 301 would take such a long time? I've double- and quadruple-checked the 301s and all settings to ensure it's being redirected properly, yet it still hasn't fully redirected. Something is wrong, and my client's ready to ditch the old domain we worked on for a good amount of time.

Background: about 30 days ago we found some redirect loops... well, not loops, but it was redirecting from the old domain to the new domain several times without error. I removed the plugins causing the multiple redirects, and now we have just one redirect from any page on the old domain to the new HTTPS version.

Any suggestions? This is really frustrating me and I just can't figure it out. My only answer at this point is to wait it out, because others have had this issue where it takes up to two months to redirect the domain. My only issue is that this is the first domain redirect out of many that has ever taken more than a week or three.
Technical SEO | waqid
Old forum with 404s, what should I do?
Hello, so I'm helping out some friends with their SEO. I've just run a Screaming Frog crawl of their entire site (which took hours and hours, I might add). They used to have a forum connected to the site, which is no longer active. Google is still indexing all of the old URLs, which unsurprisingly return 404 errors.

What should they do to prevent Google from indexing these pages? That's assuming they need to do anything at all. They don't have access to these old forum posts and therefore won't be able to fix the URLs or add a 301 redirect pointing to the most relevant alternate page. I'm new to SEO, but my instinct is that they need to have the pages return a 410 "Gone" response code to give search engines a clear signal that the pages no longer exist and won't be returning, and to remove the internal links to those URLs.

1. Is this interpretation correct?
2. What is the impact of leaving these 404s? There are over a thousand, so there's a lot.
3. What should I recommend?
Technical SEO | jordanayresaira
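For what it's worth, the 404-vs-410 distinction the question above turns on can be sketched as a simple routing rule (Python; the /forum/ path prefix is an assumption about how the old forum URLs were structured):

```python
# Paths under these prefixes are gone for good (assumed forum structure):
GONE_PREFIXES = ("/forum/",)

def status_for(path):
    """Return the HTTP status to serve for a given request path."""
    if path.startswith(GONE_PREFIXES):
        return 410  # Gone: a permanent signal, so the URL drops out of the index faster
    return 200      # everything else is served normally

print(status_for("/forum/old-thread-123"))  # 410
print(status_for("/blog/current-post"))     # 200
```

A 404 says "not found, maybe temporarily"; a 410 says "removed on purpose, stop asking", which is the clearer signal for a forum that will never come back.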
How can I stop a tracking link from being indexed while still passing link equity?
I have a marketing campaign landing page that uses a tracking URL to track clicks. The tracking links look something like this: http://this-is-the-origin-url.com/clkn/http/destination-url.com/

The problem is that Google is indexing these links as pages in the SERPs. Of course, when they get indexed and then clicked, they show a 400 error, because the /clkn/ link doesn't represent an actual page with content on it. The tracking link is set up to instantly 301 redirect to http://destination-url.com.

Right now my dev team has blocked these links from crawlers by adding Disallow: /clkn/ in the robots.txt file; however, this blocks the flow of link equity to the destination page. How can I stop these links from being indexed without blocking the flow of link equity to the destination URL?
Technical SEO | UnbounceVan
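One possible alternative to the robots.txt block, sketched in Python (the /clkn/http/... URL shape follows the example in the question): leave the tracking URLs crawlable so the 301 can be discovered and pass equity, and send an X-Robots-Tag: noindex header as an extra hint to keep them out of the SERPs:

```python
def resolve_tracking(path):
    """'/clkn/http/destination-url.com' -> (301, headers with Location + noindex)."""
    # Split '/clkn/<scheme>/<rest>' into its three pieces:
    _, scheme, rest = path.strip("/").split("/", 2)
    destination = f"{scheme}://{rest}"
    headers = {
        "Location": destination,
        "X-Robots-Tag": "noindex",  # extra hint: keep the /clkn/ URL itself out of the index
    }
    return 301, headers

status, headers = resolve_tracking("/clkn/http/destination-url.com/")
print(status, headers["Location"])
```

The key point is that a URL disallowed in robots.txt is never fetched at all, so its 301 (and any equity it would pass) stays invisible to crawlers; serving the redirect openly avoids that.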
Does the Basic SERP report really give me useful information?
Hi there! I've seen so many times in the Basic SERP report that pages/domains with low values are at the top, and I don't really understand why that can be. I have the same low values but I can't move higher for a keyword, and I figured there was no question about it: I should improve my content/site, etc. But I still don't understand why I so often see other domains/pages with lower values at the top. On the attached screenshot, I am in 12th position... Thx, E.

serp_report.jpg
Technical SEO | Neckermann
Is a newly created page's PageRank 1?
Hey, I just want to know: if I create a new web page, would its PageRank be 1?
Technical SEO | atakala
Page authority old and new website
Dear all, I tried to find this question using the search option but cannot find the exact same problem. This is the thing: I launched a new website in January, replacing our old website that did pretty well in the SERPs. The old website is still running on a subdomain old.website.com and the new website is on www.website.com (www.denhollandsche.nl). Both sites are indexed by Google right now, but I'm not sure if that's a good thing.

For our main keyword, the page on the new website has an authority of 23 and the exact same page (some minor differences) on the old website still has an authority of 30. Both are currently on the second page of Google, while some time ago they were still in position 2/3/4.

My question is: if I were to take down the old website and add a 301 redirect from the old page with PA 30 to the new page with PA 23, would the PA of the new page take over the PA of the old page? What effects can I expect?

The reason the old website is still running is that Google Images still shows images from old.domain.com instead of images from the new website... Thanks for your help, guys!
Technical SEO | stepsstones
Omniture tracking code URLs creating duplicate content
My ecommerce company uses Omniture tracking codes for a variety of different tracking parameters, from promotional emails to third-party comparison shopping engines. All of these tracking codes create URLs that look like www.domain.com/?s_cid=(tracking parameter), which are identical to the original page, and these dynamic tracking pages are being indexed. The cached version is still the original page. For now, the duplicate versions do not appear to be affecting rankings, but as we ramp up with holiday sales, promotions, adding more CSEs, etc., there will be more and more tracking URLs that could potentially hurt us.

What is the best solution for this problem? If we use robots.txt to block the ?s_cid versions, it may affect our listings on CSEs, as the bots will try to crawl the link to find product info/pricing but will be denied. Is this correct? Or do CSEs generally use other methods for gathering and verifying product information?

So far the most comprehensive solution I can think of would be to add a rel=canonical tag to every unique static URL on our site, which should solve the duplicate content issues, but we have thousands of pages and this would take an eternity (unless someone knows a good way to do this automagically; I'm not a programmer, so maybe there's a way that I don't know).

Any help/advice/suggestions will be appreciated. If you have any solutions, please explain why your solution would work, to help me understand on a deeper level in case something like this comes up again in the future. Thanks!
Technical SEO | BrianCC
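On the "automagic" canonical idea: a rough sketch (Python; the s_cid parameter name comes from the question above, the URL is made up) of deriving a self-referencing canonical by stripping the tracking parameter. In practice this logic would sit in the page template, so every page gets its tag without hand-editing:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_tag(url, drop=("s_cid",)):
    """Build a <link rel="canonical"> tag with tracking parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the tracking ones:
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in drop]
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    return f'<link rel="canonical" href="{clean}" />'

print(canonical_tag("http://www.domain.com/product?s_cid=email123"))
```

Emitted once from the template layer, every page would declare its own clean canonical, so the ?s_cid variants stop competing for indexing without touching robots.txt.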