Website Audit Tutorial: Page Indexing In Google Search Console

Is Google indexing your pages? Google Search Console has a great report on page indexing, and today you are going to learn how to use it! This Google Search Console Page Indexing Report tutorial shows you how to use the various reports to understand whether your pages are getting indexed and, if they are not, what is preventing them from being indexed.

🔥🔥 This is #5 in a series of videos to help you learn how to audit your own website!

This video includes:
⭐ What is page indexing
⭐ How Google Search Console can help you identify issues with page indexation
⭐ What is crawled not indexed
⭐ What is discovered not indexed
⭐ Samples of false positives (meaning Google is reporting an issue that does not really exist!)

Check out more 👇
🐦 Tweet with me: https://twitter.com/iamjillcaren
💌 Get the newsletter: https://2dogs.media/
🔗 Connect on LinkedIn: https://www.linkedin.com/in/jillcaren/
Download the website audit checklist: https://2dogs.media/website-audit-che...

📚 Chapters
00:10 What is page indexing?
00:43 Using Google Search Console
00:56 What are known pages
01:14 What are submitted pages
01:39 What are unsubmitted pages
02:29 Analyzing submitted pages
16:42 Analyzing unsubmitted pages

Jill Caren

1 year ago

Hey everybody, Jill here. Welcome to the next installment of the DIY Website Audit series. Today we're going to focus on sheet number two, which is page indexing. So what is page indexing? This is where Google helps us understand whether it's indexing our pages and, if it isn't, why not. There are three components I typically look at: Are pages being excluded because of a noindex tag? Are pages being crawled but not indexed? And are pages being discovered but not indexed? Those are the three main things, but within each of them there are a bunch of details we're going to look at.

To get started, jump into Google Search Console and go to Pages. You'll see there are 524 indexed pages and 947 not indexed pages. I'm not concerned about this. The reason is that when you land on this page, you're defaulting to what's known as
all known pages. This is basically Google finding every page on your site, maybe even pages that you did not want indexed. When I start an audit, I usually start with all submitted pages. These are the pages that I actually submitted to Google in my sitemap and that I do want crawled. These are the pages I care about the most, so they are the most important ones for me to audit. All known pages is everything. All submitted pages
is the pages that you actually submitted and that you want crawled. Unsubmitted is any of the other pages that Google found that you did not submit for crawling. Unsubmitted and all submitted will add up to all known. If we look here, we've got 947 not indexed. If I click all submitted, there are only 99 pages not indexed, and unsubmitted is 848. If you add up 848 and 99, you get your 947. I just want to make sure that's clear to everybody.

You can also look at each of your sitemaps individually. For example, if I look at my "become" sitemap, I can see I have 43 of my submitted pages indexed and two are not. I definitely want to look at this, because I want to know why those two are not being indexed. But for now, we're going to start here with all submitted pages, because this is what I really care about. I have 450 pages that are indexed. I am not worried about those. That's fantastic, and I don't need to do anything there. But we're going to investigate why I have 99 that are not indexed, and it tells me
that there are five reasons for these not being indexed. Another cool thing is that you can hover over each of these for a little reminder of what it is. You can also look up here to see the last time this report was updated. This one was February 20th, and my primary crawler happens to be desktop. Typically it's mobile, but every once in a while I'll see desktop. Overall I can see that over the last three months we've seen a lot of growth. We went
from 224 pages indexed up to 408. This was because we did some programmatic SEO. If that's something you're interested in, I have another video on that on my YouTube channel. I submitted, I want to say, about 160 pages programmatically sometime in January, and it clearly took a while for them to get indexed. But I'm happy to say they're finally getting indexed. Now let's undo this, because we don't care about it right now. And we can see
that not indexed was going down, then it started growing, and it kept growing. This aligns with my programmatic input, and then you can see it's starting to drop a little bit. So it may have been struggling to index those programmatic pages, but now it's starting to index them, and the not indexed count is starting to come down a little. But let's look and find out.

We're going to scroll down to the bottom, where we can see why pages are not indexed. Here we have 68 pages that are crawled but currently not indexed. So I go to my website audit template: are there pages being crawled but not indexed? I'm going to put fair, because I don't have an overwhelming amount, but I have enough to be concerned about. Let's look at these real quick. It tells me Google systems here. This is a Google thing: Google is saying, hey, we crawled these pages, but we're not going to index them for one reason or another. What this is
telling me is that I tried to validate these pages. I tried to tell Google, hey, these pages are actually okay, can we index them? And pretty much they told me, yeah, no, they're not that good. The trend is showing an increase, which means I have an increase in these pages, the same as we saw up top. This is pretty much just replicating that, and there are 68. So let's click on this. These pages aren't indexed. I started the validation and then it failed. And you can always look for
more details here and it'll tell you. They didn't all fail yet. Seven of the ones I submitted failed a second time, and there are 61 that are still pending review, so those are still going through the validation process. But those seven did fail again once I sent in the validation. What I might want to do here is take a look at some of these pages. Let's look at this one and see what's going on. Okay, this page sucks. The reality
is it sucks. There's no content on it, there's nothing going on here, and I don't blame them for not indexing it. Let's see what's going on with this one. This one's better, but still low quality. Maybe I'll add some additional content to the page and see if I can get Google to like it a little better. These are probably all pretty much the same: fairly low quality, without a ton of content on the pages. This one's a little better. I might be a little more concerned with this one, because it's actually a decent page. It's not horrible, so I might want to take a closer look at it.

If I see crawled, currently not indexed, there are a couple of things I'll look at. One, are there links going to the page? Do I have backlinks to the page, or internal links to the page? Two, is the quality really that great? This is where you have to get outside of your own head and really look at it from a user perspective. Look at the pages that are ranking well for the term, see what they're doing, and see if you can do it better. Same old thing: it's all about doing the content better. I can clearly see that all of these pages are really not that great, so I might go back and try to fix them up and make them better. So that's crawled, currently not indexed.

Let's jump down to the next one: discovered, currently not indexed. What does this mean? It basically means that Google came to your site, saw the page, decided not to crawl it, and went on its merry way. These pages are definitely not indexed, and they probably haven't even been crawled yet. I have 18 of these pages. This is on our template: are there pages being discovered but not indexed? I'm going to put fair, because I don't have an overwhelming amount. I didn't do any validation on this yet. These pages aren't indexed or served on Google, and I haven't validated anything, so I probably haven't even looked at these yet. Last crawled says N/A, which I don't see too often, so I don't know why these wouldn't be crawled.

I'm going to take a look at this page. This is a camp page. Maybe there's just not enough content on the page, and there's really nothing I can do about that. I don't see a lot of content on it. Connecticut welding schools: there's no content on this page. Florida welding schools: there's no content on this page. If I'm being honest, these pages suck. There are no words on the page, no graphs, no images, no information. Google came and just didn't like it. There's really nothing you can do here other than improve the pages. Same thing here: this is just a short little page about the camp, so I'll have to figure out a way to improve these camp pages. Again, same thing: do some internal linking and improve the page. The difference with these is that if a page hasn't been
crawled, which is what this status means, there is something you can do. Say I want to get this one indexed. I can click Inspect URL. Oh, it says URL is on Google, so it looks like it did get indexed. Hopefully it'll fall off that report pretty soon. But if you have one that has not been indexed, like this one, which is not on Google, what I can do is request indexing, and then Google might decide, okay, I can crawl it and index it now. This is what you want to go through for each of these. But don't waste your time inspecting the URL of a page you know sucks. If I go to this one, there's no reason to inspect the URL or ask Google to index it, because there's nothing going on here. So don't waste your time. One other thing you can do: when you click that little icon, it gives you some more information, and sometimes
you can see when the page was crawled. Like I said, for anything that's discovered, not indexed, they probably didn't crawl it yet. That's why there's no data in here and why it says N/A. This will also give you a little more information about where the page was discovered. They did find this URL in my sitemaps, so they know it exists; they just haven't crawled it yet. So that's discovered, currently not indexed.

Let's jump back down. Now we have excluded by noindex tag. Here I have four, and this is one of the things that is on the audit. But before you mark it good, fair, or poor, you want to ask: do you want these pages noindexed? If I click on excluded by noindex, I have four pages. I know I do want this one noindexed, and this one, and this one. So these are fine, and I have no problem with these pages. So I'm going to go over here and click good
because I don't have to worry about those. I don't have any pages that are not being indexed that I want indexed. The only one that is an issue is this one. I'm going to click on it and take a look at the URL first. The page is fine, and I know this is a page that should not have a noindex tag on it. What you might want to do is right-click, choose View Page Source, and look for your robots meta tag content. You can see here the content says follow, index for robots. So it's not telling Google not to index it, and all the other pages are being indexed. Every once in a while you may find a glitch, and this is just a glitch. If I click Inspect URL, it says URL is not on Google. Under indexing allowed, it says no, noindex detected in robots meta tag. But I just showed you in the page source that there was no noindex. So now what? Let's test here for robots.txt blocking. There is no robots.txt blocking it. It says it's allowed.
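
The two manual checks just described (reading the robots meta tag in the page source, and testing whether robots.txt blocks the URL) can also be scripted. The following is a minimal sketch using only the Python standard library. The page source, robots.txt text, and example.com URLs are made-up stand-ins, and the function names are hypothetical. It only reads the HTML and header values you pass in, so it will not see a noindex that is injected by JavaScript, and it treats the X-Robots-Tag header as a simple comma-separated list.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> or "googlebot" tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # Split "follow, index" into ["follow", "index"].
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, x_robots_tag=""):
    """True if the page source or the X-Robots-Tag header value asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = [d.strip().lower() for d in x_robots_tag.split(",")]
    return "noindex" in parser.directives or "noindex" in header


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical page source and robots.txt, standing in for a real site.
page = '<html><head><meta name="robots" content="follow, index"></head></html>'
robots = "User-agent: *\nDisallow: /wp-admin/\n"

print(has_noindex(page))
print(is_crawl_allowed(robots, "https://example.com/welding-schools/"))
```

If a check like this finds no noindex and no robots.txt block while Search Console still reports "excluded by noindex", that is consistent with the false positive shown here. It is not proof, though, because Google may have seen a different version of the page when it crawled.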
So this is just a false positive, and you will find these every so often. If you see excluded by noindex, you really should go through them all and make sure. Check your page source, and check whether you chose to noindex it with Yoast or Rank Math. You just want to do your homework and make sure there are no issues. But there are often false positives throughout Google Search Console, not just with this one issue. So now we know our noindex tag is fine.

Now we're going to look at not found (404). Typically, guys, I wouldn't obsess over these. If there's only one, I probably wouldn't even bother looking at it. It might just be a glitch, the site might have been down for a second, or it might have been a little crawling bug. I'm only going to show it to you because I want you to understand what it looks like. So, not found (404): it's just one page in the examples. It's telling me this page was a 404. We click on it, and the page is there and it's fine. So again, this is a false positive. Let's see what happens when we inspect the URL. URL is not on Google, and it's saying that when it went to fetch the page, for some reason it got a 404. So this is another one where I would request indexing and ask Google to come crawl it.

When you do this, it might take a minute or two. I'm going to let this go, sorry for the pause, because I want you to see what it looks like if you've never done this. It just tells Google: hey Google, can you come see this page? I don't know why you didn't index it, but I'd like you to take another look at it. When the crawler comes back to your site, it'll take a look at the page and see if it's worth indexing. You can see it says indexing requested, URL was added to a priority crawl queue. Only do it once; don't try to submit it more than once. Click Got it, then just let it go, and hopefully that'll drop off at some point. So that's it for the 404 not found.

The last one is the server error (5xx), the 500. This typically happens because there's something going on with the server. I actually did have a little bit of a glitch on my server, which was my own fault. It was a bad piece of code. I don't know if that had anything to do with this, but you can see there were some pages that for whatever reason
hit a server error. But if you look at these, they're from November, so I wouldn't even worry about them. This is from months ago. There's nothing new and nothing to be concerned about. I did put them through for validation, and eventually this should just clear itself out.

Let's go back to page indexing. So that was all submitted pages. Now maybe I want to go to one of my sitemaps and see what's going on there. This is going to be pretty much the same as what we just looked at, but it breaks it down into the different sections of your website, if you have different sitemaps set up. Here we have 43 indexed. I know I have 45 of these articles, and for some reason two of them are not being indexed, so I want to see which ones. One is excluded by noindex tag. That's fine. I don't want that one to be indexed, so that's perfect. The other one is crawled, currently not indexed. You can see December 25th was the last time it was crawled, so let's see if it is indexed now. Now it's saying URL is on Google. So again, not quite a false positive: it probably was not indexed back in December, but it is fixed now. Now I can click Validate Fix, and hopefully this will just fall off the report.

So we went through the sitemaps. Now let's go through unsubmitted pages only. These are the pages that Google found that were not part of my sitemap. There are 74 pages indexed that I did not have in my sitemap. I'm going to take a look at those, because I want to see what's going on. For one, you can see that the count is starting to drop off, which is good. But I want to see what they are. It's a little bit different up here: if you want to see the ones that are indexed, you click View data about indexed pages. These are pages that are indexed but were not found in a sitemap, and these are fine. They're actually categories, which I did not have sitemaps for. It looks like Google found the categories before I implemented a sitemap for them, so that's why they're showing up here. It's okay that they are indexed, because I do want these indexed now. I guess Google found them through internal linking on my site before I submitted a sitemap. This one is a brand new article, so it looks like Google probably crawled it before I submitted a new
sitemap with that article in it. Maybe it was crawled on February 20th and my sitemap didn't go live until February 21st. So Google is saying, hey, I found this article on February 20th, and I do have some internal links to it, but I didn't see it in your sitemap on February 20th, so we're just going to place it here. That is what all of these are: Google finding pages on my site that were not in my sitemap. They might be in my sitemap now, or they may not be; we don't know. You just want to make sure you do want these pages indexed. I would go through all of them. If you don't want one indexed, you need to figure out a way to get that content out of the index. You can ask Google to remove it, you can add a line to your robots.txt file, or you can use Yoast or Rank Math to ask that it not be included.

And then not indexed. Let's take a look at this. I have a lot of pages that Google is finding that are not indexed, so let's find out what's going on. Not found (404): I have 357 pages that Google found that are showing a 404. When you click on this, you can see it's actually growing. These are some city and state pages. Let's see if this one even exists. It does not exist, and I'm not sure how they found it. We can do a couple of things to try to figure it out. We can click on this to inspect it and see if there's anything going on. URL is not on Google. There's no sitemap detected for this page. There's no referring page. Last crawl was February 19th, crawled as Googlebot smartphone. Crawl allowed: yes. Which is interesting. This one I'd probably want to do a deeper dive into to understand what's going on. I don't even have this page on my site, and I don't believe I have any internal links going to it. So I definitely want to investigate, and I really don't have an answer for this, because I don't know why it's showing up. I might have had this city in there a while ago and deleted it because there wasn't enough data to populate that city. Maybe Google caught something from a while ago.

This is definitely an area that you'd want to check. On a smaller site it's not a big deal, but if this site were thousands and thousands of pages, having Google come and crawl 400 pages that don't even exist is really a waste of resources. So you'd want to look at this and understand: why are they crawling these
pages, and what can I do to stop it? What I can probably do here is add some kind of rule to my robots.txt file to block these pages from being crawled. But I'd have to do it in a way that doesn't disallow valid trade school pages from being crawled, so this is going to take a little bit of research on my end. Again, this one has nothing found. Let's look. This is actually a school, which I don't think exists. So I have some work to do here to figure out what's going on. There's definitely something weird going on for Google to be picking these up. Here you can see there was a wrong URL. These were all fixed already; they were just broken URLs. So there are a lot of different things that can cause issues here. This one was also a broken URL, and it has since been fixed. Keep in mind that you want to look at the dates as well. You don't want to go too far back, because some of these might be fixed already. I didn't do any validate fixes here, so I'll wait on that. So definitely take a look at page indexing, not found.

Then, excluded by noindex tag. There are 342 pages with noindex. I was sure these do not have a noindex tag, though there's nothing going on with these pages anyway. We haven't populated them yet, but there's definitely not a noindex in the robots tag. And you can see... ooh, there you go. It actually does say noindex. I just found a big b*****. So I definitely want to make sure that I remove that. I don't know why there's a noindex there. And this is why we go through this process. Google found this page but wouldn't index it, because I'm telling it to noindex it. Massachusetts: this might be the same thing. View page source. Obviously I need to add content to the page before I ask Google to index it. Same thing: I told it noindex. So this is a valid issue I found by looking through Google Search Console. This is why we do what we do.

Let's go back to unsubmitted pages. We did excluded by noindex. Next is page with redirect. I have 56 pages that are showing a redirect of some sort, and we need to understand whether they're valid or not. I know this one's wrong. It has been here a while, and I can't figure out how to get rid of it: "how to become" and then "how to become a locksmith". When I first put this post up, this was the URL. I changed it because I realized "how to become" was in there twice. If I open it in a new tab, it does redirect to the right place, so that's great, but for some reason Google is still catching this and I don't understand why. If we inspect the URL, there are no referring sitemaps detected, and it's telling me that this is the referring page. I might want to look at that page, but I know that's not the case. This is a false positive, guys. If I go to "how to become", there is nothing on that page that has this URL on it, and I know that because I've already checked. So this is one where I might ask Google to validate this as being fixed, because I know it's
not really an issue. We're going to do the same thing with this report that we do with all the other ones. Go through and look at the dates first. If it was crawled months ago, don't worry about it. If it's fairly recent, then you definitely want to go through and make sure they are valid redirects.

After that, alternate page with proper canonical tag. I only have eight of those, and these are searches, so I'm not even going to worry about them. They're also all from 2022, so these are nothing. I could probably do validate fix for these; if it's back in 2022, I'm just not worried about it. A canonical tag, if you're not familiar with it, works like this. Say you put up another chocolate chip cookie recipe and you already have a chocolate chip cookie recipe, but you only want one of those pages to actually rank. You would put a canonical link from your newer post to your older post. So a canonical
tag is basically telling Google: hey, I have this second post about chocolate chip cookies, but I really only want you to index and care about the other one. Adding that canonical link doesn't affect the user experience. It's just there to tell Google that you have another page about the same topic, but you want the other page to take precedence in the Google search results. This is a way of dealing with duplicate content, and I do use it in some scenarios. So that is what a canonical tag is.

Soft 404: it might have just been a blip. This is all stuff I wouldn't even be worried about. It has to do with a map that I have, so there's really nothing going on there. I don't really see soft 404s as big issues too often. Server error (5xx): again, the site could have been down for a split second, or maybe it was overloaded. These are actually feed URLs, and most of you will probably see a lot of these in your Google Search Console. This is basically a WordPress thing, where it appends /feed at the end of the URL for your RSS feed. You can block those, and I know I have them blocked on my site, so these should eventually not be here anymore either. I don't really have an RSS feed on my site, because nobody uses those anymore. I did remove mine. But if you see these in here, it's not a big deal, so don't even worry about it. Not everything in here is an issue. You have to understand what is and what isn't. Anything with feed appended at the end is not really an issue.

Blocked by robots.txt might be an issue. These are feeds again, which is not a big deal because I have them blocked, and they're from 2022 anyway. I don't care about those. I'm only worried about pages that bring me money, and that's really all you need to worry about. And these last two we went over in some of the other sections.

So this is everything. You just got an overview of what's going on with your page indexing, and we know what we need to look at. I know this is a really hard topic, so if you have any questions at all, please drop them in the comments on the video and I will do my best to answer them. Thanks very much, guys, and have a great day.
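
A few of the rules of thumb from the walkthrough (submitted plus unsubmitted should add up to all known, feed URLs can be ignored, look at the crawl date before worrying) can be sketched as a small triage helper for rows copied out of a not-indexed report. This is only an illustration. The function names, the labels, the example.com URLs, the dates, and the 90-day cutoff for "months ago" are all assumptions, not anything Search Console provides.

```python
from datetime import date
from urllib.parse import urlsplit


def counts_reconcile(all_known, submitted, unsubmitted):
    """Submitted plus unsubmitted should equal all known pages."""
    return submitted + unsubmitted == all_known


def triage(url, last_crawled, today, stale_after_days=90):
    """Return a rough label for one row of a not-indexed report.

    last_crawled is a date, or None when the report shows N/A.
    stale_after_days is an assumed cutoff for "crawled months ago".
    """
    path = urlsplit(url).path
    if path.rstrip("/").endswith("/feed"):
        return "ignore: feed URL"
    if last_crawled is None:
        return "review: not crawled yet"
    if (today - last_crawled).days > stale_after_days:
        return "low priority: old crawl date"
    return "review: recent"


# The not-indexed counts quoted in the video: 947 known, 99 submitted, 848 unsubmitted.
print(counts_reconcile(947, 99, 848))

# Hypothetical rows; the URLs and dates are made up for illustration.
today = date(2023, 2, 21)
rows = [
    ("https://example.com/some-post/feed/", date(2022, 11, 3)),
    ("https://example.com/florida-welding-schools/", None),
    ("https://example.com/old-page/", date(2022, 11, 3)),
    ("https://example.com/new-article/", date(2023, 2, 19)),
]
for url, crawled in rows:
    print(url, "->", triage(url, crawled, today))
```

A helper like this only sorts rows into buckets. Deciding whether a page deserves to be indexed still means opening it and judging the content, as shown in the video.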

Comments

@shafikhan6935

I have discovered this channel and indexed it in my library. Thanks. Your channel is a gem.

@izharkhurshid7369

You are sharing some premium stuff for free. Highly appreciated. When should I expect the next video?

@khamismaiouf672

Hi Jill, thank you for your informative content. How can I access your premium template sheet?

@thetailwagwisdom

Hi Jill - Thanks again for the helpful video! I have a few questions.

1. Under unsubmitted pages, alternate page with proper canonical tag:
 A. Do I need to worry about those ending in “/?amp=1”? I have 35 with this ending.
 B. I have 2 with weird long endings to posts that begin after my post name, then /?fbclid= (and a bunch of letters and numbers). But my post is still working. Do I care about this one, or what is it?
 C. One looks like it’s part of my social media plugin, but my post still works without the added /?utm_medium=twitter&twitterutm_source=socialnetwork. I removed my Twitter account from my website, so that may have something to do with this error.

2. On the submitted pages, crawled - currently not indexed: There are a few posts that have referring pages from Pinterest that aren’t my pins. However, when I click on the pin links, they go to the pin authors' page and not mine. Is this just a glitch? Should I care?

3. On unsubmitted pages, excluded not indexed: Mine look to be all categories, pages, search, author, and privacy. Would it be correct that you wouldn’t necessarily want these pages indexed?

@samiulhaq318

Great video. Do you have a premium version of this audit, and how can I get it?