I have a Next.js website. I want the site indexed for SEO so it is actually visible on Google. When I request indexing in Google Search Console, it says the page cannot be indexed due to an unauthorized request (401).
I have a robots.txt file and sitemap.xml in the public folder.
Here is my robots.txt:
User-agent: *
Disallow:
Sitemap: https://cardcounter21.com/sitemap.xml
Here is my sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Home Page -->
  <url>
    <loc>https://cardcounter21.com/home</loc>
    <lastmod>2024-09-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
  <!-- Learn How Page -->
  <url>
    <loc>https://cardcounter21.com/learnhow</loc>
    <lastmod>2024-09-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.9</priority>
  </url>
  <!-- Handbook Page -->
  <url>
    <loc>https://cardcounter21.com/handbook</loc>
    <lastmod>2024-09-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <!-- Premium Handbook Page -->
  <url>
    <loc>https://cardcounter21.com/premiumhandbook</loc>
    <lastmod>2024-09-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <!-- Card Counter Upsell Page -->
  <url>
    <loc>https://cardcounter21.com/cardcounterupsell</loc>
    <lastmod>2024-09-08</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
Here is my middleware.ts:
import { authMiddleware } from "@clerk/nextjs";
import { NextResponse, NextRequest } from "next/server";

// Returns true when the request comes from a known search-engine crawler
const isBot = (req: NextRequest) => {
  const userAgent = req.headers.get('user-agent') || '';
  const botUserAgents = [
    'Googlebot',
    'Bingbot',
    'Yahoo! Slurp',
    'DuckDuckBot',
    'Baiduspider',
    'YandexBot',
    'Sogou',
    'Exabot',
    'facebot',
    'ia_archiver'
  ];
  return botUserAgents.some((bot) => userAgent.includes(bot));
};

export default authMiddleware({
  afterAuth(auth, req, evt) {
    // Let crawlers through without authentication
    if (isBot(req)) {
      return NextResponse.next();
    }
    // Send signed-out visitors on protected routes to the home page
    if (!auth.userId && !auth.isPublicRoute) {
      const home = new URL("/home", req.url);
      return NextResponse.redirect(home);
    }
  },
  publicRoutes: [
    "/sitemap",
    "/home",
    "/learnhow",
    "/handbook",
    "/premiumHandbook",
    "/CardCounterUpsell",
    "/sitemap.xml",
    "/robots.txt",
    "/api/simulation",
    "/Images/Logo",
    "/test",
    "/api/webhook",
    "/api/welcomeemails",
    "/api/handbookEmails"
  ],
});

export const config = {
  matcher: ["/((?!.+\.[\w]+$|_next).*)", "/", "/(api|trpc)(.*)"],
};
Here is a screenshot of the error:
Any help would be greatly appreciated!
Thanks!
We also have an eCommerce website built on Next.js.
In the beginning we had this same indexing issue. I believe the sitemap URL you mentioned above is not correct.
Check the correct URL for the sitemap in Next.js, and that might fix this issue.
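For example, if you are on the App Router (Next.js 13.3+), you can generate the sitemap from an app/sitemap.ts file and Next.js will serve it at /sitemap.xml. This is only a sketch with your domain and routes plugged in as placeholders, not necessarily your exact setup:

// app/sitemap.ts -- Next.js serves the generated file at /sitemap.xml
import { MetadataRoute } from "next";

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: "https://cardcounter21.com/home",
      lastModified: "2024-09-08",
      changeFrequency: "monthly",
      priority: 1.0,
    },
    {
      url: "https://cardcounter21.com/learnhow",
      lastModified: "2024-09-08",
      changeFrequency: "monthly",
      priority: 0.9,
    },
    // ...remaining pages follow the same shape
  ];
}

Generating it this way also makes it easy to confirm the exact sitemap URL that Google should fetch.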
robots.txt example:
User-agent: Googlebot
Disallow: /nogooglebot/
User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml
Reference: https://developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt
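If you are on the App Router, the robots.txt above can likewise be generated from an app/robots.ts file and served at /robots.txt. Again just a sketch mirroring Google's example, with example.com as a placeholder domain:

// app/robots.ts -- Next.js serves the generated file at /robots.txt
import { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      { userAgent: "Googlebot", disallow: "/nogooglebot/" },
      { userAgent: "*", allow: "/" },
    ],
    sitemap: "https://www.example.com/sitemap.xml",
  };
}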