Question regarding search crawler sitemap.xml

Question regarding search crawler sitemap.xml

Ard
When we create a search crawler sitemap.xml, do we normally tell
crawlers where to find it through robots.txt, do we inform
Google/Yahoo ourselves, or do they look at some predefined location
by default, like /sitemap.xml?

I've already read [1] and [2], but afaics from those you need to tell
the search engines where the sitemap.xml or sitemap index can be found.
How do we normally do this?

Regards Ard

[1] http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183668
[2] http://support.google.com/webmasters/bin/answer.py?hl=en&answer=71453
_______________________________________________
Hippo-cms7-user mailing list and forums
http://www.onehippo.org/cms7/support/forums.html

Re: Question regarding search crawler sitemap.xml

Jeroen Reijn
Administrator
I guess that depends on the client. Google itself looks in the
robots.txt, or we/the customer submit the sitemap:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183669

AFAIK not many customers are using this.


On Tue, Jul 24, 2012 at 3:05 PM, Ard Schrijvers
<[hidden email]> wrote:

> When we create a search crawler sitemap.xml, do we normally tell the
> location where to find this sitemap.xml through robots.txt or do we
> inform Google/Yahoo ourselves or do they by default look at some
> predefined location, like /sitemap.xml?

--
Jeroen Reijn
Solution Architect
Hippo

Amsterdam - Oosteinde 11, 1017 WT Amsterdam
Boston - 1 Broadway, Cambridge, MA 02142

US +1 877 414 4776 (toll free)
Europe +31(0)20 522 4466
www.onehippo.com

http://about.me/jeroenreijn

Re: Question regarding search crawler sitemap.xml

Wouter Pasman
In reply to this post by Ard
We use the robots.txt file in Minbuza: http://www.minbuza.nl/robots.txt


Re: Question regarding search crawler sitemap.xml

Ard
On Tue, Jul 24, 2012 at 3:13 PM, Wouter Pasman <[hidden email]> wrote:
> We use the robots.txt file in Minbuza [1] http://www.minbuza.nl/robots.txt

Ah great! However, I was trying something similar for Hippo Connect,
but I do not seem to be able to add this Sitemap line to the
robots.txt from the forge plugin.

Regards Ard

Re: Question regarding search crawler sitemap.xml

Wouter Danes-2
Hi Ard,

Yes, I haven't yet created anything in the sitemap plugin that talks to the robots.txt plugin.
It'd be nice to have something in there for sure.
Which reminds me, we should probably release that forge sitemap plugin some time :)

Re: Question regarding search crawler sitemap.xml

Ard
On Tue, Jul 24, 2012 at 3:45 PM, Wouter Danes <[hidden email]> wrote:
> Hi Ard,
>
> Yes, I haven't created something that talks to the robots.txt plugin yet from the sitemap plugin.
> It'd be nice to have something in there for sure.

There is a robots.txt Hippo forge project, but in there I could not
find a way to tell search crawlers where my sitemap (index) is
located. Hence I thought this might simply have been forgotten for
quite some sites: the sitemap is nicely created, but nobody thought of
registering it with search crawlers. Doing it by hand for Google is
too limited; there are more search engines.

Ard

Re: Question regarding search crawler sitemap.xml

Wouter Danes-2
Aye, the robots.txt forge project is too limited. At RO.nl we manually edited the JSP that the robots.txt plugin uses so that it includes the sitemap, but it'd be nicer to be able to specify the sitemap from the plugin.

Re: Question regarding search crawler sitemap.xml

Mathijs Brand
Reading this thread I was surprised. The robots.txt plugin used to be just a plain text file you could edit in the CMS. I see it's been improved, but it is now a bit less flexible. Maybe we can just add a plain text field to the robots.txt plugin, so you can add custom rules like this one.

From http://www.sitemaps.org/protocol.html#submit_robots
Specifying the Sitemap location in your robots.txt file
You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line including the full URL to the sitemap:
Sitemap: http://www.example.com/sitemap.xml
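
To check that a robots.txt really advertises the sitemap the way a crawler would read it, something like the following could be used. It is only a minimal sketch, assuming Python 3.8+ (needed for `RobotFileParser.site_maps()`); the `sitemap_urls` helper and the sample rules are made up for illustration and are not part of the Hippo plugins.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Sample robots.txt content (illustrative only): one Disallow rule
# plus the Sitemap line from the sitemaps.org example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
"""


def sitemap_urls(robots_txt: str) -> List[str]:
    """Return the sitemap URLs declared in a robots.txt body (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemap_urls(ROBOTS_TXT))
```

To run the same check against a live site, `RobotFileParser` also offers `set_url()` and `read()` to fetch the file instead of parsing a string.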

Kind regards,
Mathijs Brand
Hippo

Re: Question regarding search crawler sitemap.xml

Ard
On Tue, Jul 24, 2012 at 11:02 PM, Mathijs Brand <[hidden email]> wrote:
> Reading this thread I was surprised. The robots.txt plugin used to be just a
> plain text file you could edit in the CMS. I see it's been improved, but now
> the flexibility is a bit less. Maybe we can just add a plain text field in
> the robots.txt plugin, so you can add custom rules like this one.

It was the first time I looked at it, and it surprised me as well:
indeed, I wanted to tell Google/Yahoo etc. through robots.txt where
they could find the sitemap.

Regards Ard

Re: Question regarding search crawler sitemap.xml

Gerrit Berkouwer
Ard, you can also submit the sitemap.xml via Google Webmaster Tools, see http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183669

Greetings, Gerrit

Re: Question regarding search crawler sitemap.xml

Ard
On Wed, Jul 25, 2012 at 12:15 AM, Gerrit Berkouwer
<[hidden email]> wrote:
> Ard, you can also submit the sitemap.xml via Google Webmaster Tools, see
> http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183669

I know, but I don't want to :-)

I want it to happen automatically, which can be achieved through
robots.txt, but I didn't see this option in the Hippo forge robots.txt
project: there you can only disallow URLs, not include a Google sitemap.
Also, what about Yahoo, Bing, etc.? They all honor robots.txt, so it
is much easier to automate it through there than to submit it
by hand.
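
A rough sketch of generating such a robots.txt automatically, in Python purely for illustration: the `render_robots_txt` function and its parameters are hypothetical and are not the forge plugin's API. It writes the Disallow rules followed by one `Sitemap:` line per sitemap (or sitemap index) URL.

```python
from typing import Iterable, List


def render_robots_txt(disallowed: Iterable[str],
                      sitemaps: Iterable[str],
                      user_agent: str = "*") -> str:
    """Render a robots.txt body with Disallow rules and Sitemap lines.

    Hypothetical helper for illustration only.
    """
    lines: List[str] = [f"User-agent: {user_agent}"]
    rules = [f"Disallow: {path}" for path in disallowed]
    # An empty Disallow value means that nothing is disallowed.
    lines += rules or ["Disallow:"]

    sitemap_lines: List[str] = []
    for url in sitemaps:
        # The sitemaps.org protocol asks for the full URL, not just a path.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"Sitemap location must be a full URL: {url!r}")
        sitemap_lines.append(f"Sitemap: {url}")
    if sitemap_lines:
        # Blank line to separate the user-agent group from the Sitemap lines.
        lines += [""] + sitemap_lines

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(["/private/"],
                            ["http://www.example.com/sitemap.xml"]), end="")
```

The Sitemap line is independent of the User-agent groups, so its position in the file does not matter; putting it at the end is just the choice made in this sketch.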

I've created an issue for it, see [1]

Regards Ard

[1] https://issues.onehippo.com/browse/HIPPLUG-458

Re: Question regarding search crawler sitemap.xml

Gerrit Berkouwer
Great addition :-).

Gerrit
