org.opencms.site.xmlsitemap
Class CmsXmlSitemapGenerator

java.lang.Object
  extended by org.opencms.site.xmlsitemap.CmsXmlSitemapGenerator

public class CmsXmlSitemapGenerator
extends java.lang.Object

Class for generating XML sitemaps for SEO purposes, as described in http://www.sitemaps.org/protocol.html.


Nested Class Summary
protected  class CmsXmlSitemapGenerator.ResultEntry
          A bean that consists of a sitemap URL bean and a priority score, to determine which of multiple entries with the same URL are to be preferred.
 
Field Summary
static java.lang.String DEFAULT_CHANGE_FREQUENCY
          The default change frequency.
static double DEFAULT_PRIORITY
          The default priority.
protected  java.lang.String m_baseFolderRootPath
          The root path for the sitemap root folder.
protected  java.lang.String m_baseFolderSitePath
          The site path of the base folder.
protected  boolean m_computeContainerPageDates
          Flag to control whether container page dates should be computed.
protected  java.util.List<CmsDetailPageInfo> m_detailPageInfos
          The list of detail page info beans.
protected  java.util.Map<java.lang.String,java.util.List<CmsResource>> m_detailResources
          A map from type names to lists of potential detail resources of that type.
protected  com.google.common.collect.Multimap<java.lang.String,java.lang.String> m_detailTypesByPage
          A multimap from detail page root paths to corresponding types.
protected  CmsObject m_guestCms
          A CMS context with guest privileges.
protected  CmsPathIncludeExcludeSet m_includeExcludeSet
          The include/exclude configuration used for choosing pages for the XML sitemap.
protected  com.google.common.collect.Multimap<CmsUUID,CmsAlias> m_pageAliasesBelowBaseFolderByStructureId
          A multimap from structure ids to page aliases below the base folder which point to the given structure id.
protected  java.util.Map<java.lang.String,CmsXmlSitemapGenerator.ResultEntry> m_resultMap
          The map used for storing the results, with URLs as keys.
protected  CmsObject m_siteGuestCms
          A guest user CMS object with the site root of the base folder.
protected  java.lang.String m_siteRoot
          The site root of the base folder.
protected  java.lang.String m_siteRootLink
          A link to the site root.
 
Constructor Summary
CmsXmlSitemapGenerator(java.lang.String folderRootPath)
          Creates a new sitemap generator instance.
 
Method Summary
protected  void addResult(CmsXmlSitemapUrlBean result, int resultPriority)
          Adds a URL bean to the internal map of results, but only if there is no existing entry with a higher internal priority than the priority given as an argument.
protected  long computeContainerPageModificationDate(CmsResource containerPage)
          Computes the container page modification date from its referenced contents.
 java.util.List<CmsXmlSitemapUrlBean> generateSitemapBeans()
          Generates a list of XML sitemap entry beans for the root folder which has been set in the constructor.
protected static java.lang.String getChangeFrequency(java.util.List<CmsProperty> properties)
          Gets the change frequency for a sitemap entry from a list of properties.
protected  java.lang.String getDetailLink(CmsResource pageRes, CmsResource detailRes, java.util.Locale locale)
          Gets the detail link for a given container page and detail content.
protected  java.util.List<I_CmsResourceType> getDetailTypesForPage(CmsResource resource)
          Gets the types for which a given resource is configured as a detail page.
protected  java.util.List<CmsResource> getDirectPages()
          Gets the list of pages which should be directly added to the XML sitemap.
 CmsPathIncludeExcludeSet getIncludeExcludeSet()
          Gets the include/exclude configuration of this XML sitemap generator.
protected  java.lang.String getInnerXmlForEntry(CmsXmlSitemapUrlBean entry)
          Writes the inner node content for a url element to a buffer.
protected  java.util.List<CmsResource> getNavigationPages()
          Gets the list of pages from the navigation which should be directly added to the XML sitemap.
protected  java.lang.String getUrlSetOpenTag()
          Gets the opening tag for the urlset element (can be overridden to add e.g. more namespaces).
protected  java.lang.String getXmlForEntry(CmsXmlSitemapUrlBean entry)
          Writes the XML for a URL entry to a buffer.
protected  boolean isAliasBelowBaseFolder(CmsAlias alias)
          Checks whether the given alias is below the base folder.
protected static void removeInternalFiles(java.util.List<CmsResource> resources)
          Removes files marked as internal from a resource list.
 java.lang.String renderSitemap()
          Generates a sitemap and formats it as a string.
 void setComputeContainerPageDates(boolean computeContainerPageDates)
          Enables or disables computation of container page dates.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_CHANGE_FREQUENCY

public static final java.lang.String DEFAULT_CHANGE_FREQUENCY
The default change frequency.

See Also:
Constant Field Values

DEFAULT_PRIORITY

public static final double DEFAULT_PRIORITY
The default priority.

See Also:
Constant Field Values

m_baseFolderRootPath

protected java.lang.String m_baseFolderRootPath
The root path for the sitemap root folder.


m_baseFolderSitePath

protected java.lang.String m_baseFolderSitePath
The site path of the base folder.


m_computeContainerPageDates

protected boolean m_computeContainerPageDates
Flag to control whether container page dates should be computed.


m_detailPageInfos

protected java.util.List<CmsDetailPageInfo> m_detailPageInfos
The list of detail page info beans.


m_detailResources

protected java.util.Map<java.lang.String,java.util.List<CmsResource>> m_detailResources
A map from type names to lists of potential detail resources of that type.


m_detailTypesByPage

protected com.google.common.collect.Multimap<java.lang.String,java.lang.String> m_detailTypesByPage
A multimap from detail page root paths to corresponding types.


m_guestCms

protected CmsObject m_guestCms
A CMS context with guest privileges.


m_includeExcludeSet

protected CmsPathIncludeExcludeSet m_includeExcludeSet
The include/exclude configuration used for choosing pages for the XML sitemap.


m_pageAliasesBelowBaseFolderByStructureId

protected com.google.common.collect.Multimap<CmsUUID,CmsAlias> m_pageAliasesBelowBaseFolderByStructureId
A multimap from structure ids to page aliases below the base folder which point to the given structure id.


m_resultMap

protected java.util.Map<java.lang.String,CmsXmlSitemapGenerator.ResultEntry> m_resultMap
The map used for storing the results, with URLs as keys.


m_siteGuestCms

protected CmsObject m_siteGuestCms
A guest user CMS object with the site root of the base folder.


m_siteRoot

protected java.lang.String m_siteRoot
The site root of the base folder.


m_siteRootLink

protected java.lang.String m_siteRootLink
A link to the site root.

Constructor Detail

CmsXmlSitemapGenerator

public CmsXmlSitemapGenerator(java.lang.String folderRootPath)
                       throws CmsException
Creates a new sitemap generator instance.

Parameters:
folderRootPath - the root folder for the XML sitemap to generate
Throws:
CmsException - if something goes wrong
Method Detail

getChangeFrequency

protected static java.lang.String getChangeFrequency(java.util.List<CmsProperty> properties)
Gets the change frequency for a sitemap entry from a list of properties.

If the change frequency is not defined in the properties, this method will return null.

Parameters:
properties - the properties from which the change frequency should be obtained
Returns:
the change frequency string
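
Because this method returns null when no change frequency is defined, callers need a fallback. The following is a minimal sketch of that pattern, not the OpenCms implementation: a plain Map stands in for the list of CmsProperty objects, and the property key "example.changefrequency" and the default value "daily" are placeholders chosen for illustration (the actual value of DEFAULT_CHANGE_FREQUENCY is not shown on this page).

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in sketch; the property key and the default value are placeholders. */
class ChangeFrequencySketch {

    /** Hypothetical property key holding the change frequency. */
    static final String CHANGE_FREQ_PROPERTY = "example.changefrequency";

    /** Placeholder default; not taken from the OpenCms constant. */
    static final String ASSUMED_DEFAULT = "daily";

    /** Returns the configured change frequency, or null if none is defined. */
    static String getChangeFrequency(Map<String, String> properties) {
        String value = properties.get(CHANGE_FREQ_PROPERTY);
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return value.trim();
    }

    /** Caller-side fallback for the null case. */
    static String effectiveChangeFrequency(Map<String, String> properties) {
        String value = getChangeFrequency(properties);
        return value != null ? value : ASSUMED_DEFAULT;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println(effectiveChangeFrequency(props)); // falls back to the placeholder default
        props.put(CHANGE_FREQ_PROPERTY, "weekly");
        System.out.println(effectiveChangeFrequency(props)); // uses the configured value
    }
}
```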

removeInternalFiles

protected static void removeInternalFiles(java.util.List<CmsResource> resources)
Removes files marked as internal from a resource list.

Parameters:
resources - the list from which internal files should be removed
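
Since the method returns void, it presumably modifies the given list in place. A minimal sketch of such an in-place filter follows; the Res class is a hypothetical stand-in for CmsResource, and its internal flag is an assumption about how the "internal" marker could be exposed.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in sketch of in-place filtering; Res is a hypothetical replacement for CmsResource. */
class InternalFilterSketch {

    static final class Res {
        final String path;
        final boolean internal;

        Res(String path, boolean internal) {
            this.path = path;
            this.internal = internal;
        }
    }

    /** Removes all resources flagged as internal from the given (mutable) list. */
    static void removeInternalFiles(List<Res> resources) {
        resources.removeIf(resource -> resource.internal);
    }

    public static void main(String[] args) {
        List<Res> resources = new ArrayList<>();
        resources.add(new Res("/sites/default/index.html", false));
        resources.add(new Res("/sites/default/.content/config.xml", true));
        removeInternalFiles(resources);
        System.out.println(resources.size() + " resource(s) left");
    }
}
```

The list passed in must support removal; an immutable list would cause removeIf to throw UnsupportedOperationException.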

generateSitemapBeans

public java.util.List<CmsXmlSitemapUrlBean> generateSitemapBeans()
                                                          throws CmsException
Generates a list of XML sitemap entry beans for the root folder which has been set in the constructor.

Returns:
the list of XML sitemap entries
Throws:
CmsException - if something goes wrong

getIncludeExcludeSet

public CmsPathIncludeExcludeSet getIncludeExcludeSet()
Gets the include/exclude configuration of this XML sitemap generator.

Returns:
the include/exclude configuration

renderSitemap

public java.lang.String renderSitemap()
                               throws CmsException
Generates a sitemap and formats it as a string.

Returns:
the sitemap XML data
Throws:
CmsException - if something goes wrong

setComputeContainerPageDates

public void setComputeContainerPageDates(boolean computeContainerPageDates)
Enables or disables computation of container page dates.

Parameters:
computeContainerPageDates - the new value

addResult

protected void addResult(CmsXmlSitemapUrlBean result,
                         int resultPriority)
Adds a URL bean to the internal map of results, but only if there is no existing entry with a higher internal priority than the priority given as an argument.

Parameters:
result - the result URL bean to add
resultPriority - the internal priority to use for updating the map of results
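
The replacement rule can be illustrated with a small stand-in. This is a sketch under stated assumptions rather than the actual implementation: the Entry class and its source label are hypothetical, and an entry with equal priority is allowed to replace the existing one, following the wording "no existing entry with a higher internal priority".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in sketch of a priority-aware result map; not the OpenCms implementation. */
class ResultMapSketch {

    /** Hypothetical counterpart of ResultEntry: a label for the bean plus its internal priority. */
    static final class Entry {
        final String source;
        final int priority;

        Entry(String source, int priority) {
            this.source = source;
            this.priority = priority;
        }
    }

    /** Results keyed by URL, mirroring the role of m_resultMap. */
    final Map<String, Entry> resultMap = new LinkedHashMap<>();

    /** Stores the entry unless the URL already has an entry with a strictly higher priority. */
    void addResult(String url, String source, int resultPriority) {
        Entry existing = resultMap.get(url);
        if (existing == null || existing.priority <= resultPriority) {
            resultMap.put(url, new Entry(source, resultPriority));
        }
    }

    public static void main(String[] args) {
        ResultMapSketch sketch = new ResultMapSketch();
        sketch.addResult("http://example.com/news", "navigation", 1);
        sketch.addResult("http://example.com/news", "detail page", 2); // replaces the first entry
        sketch.addResult("http://example.com/news", "alias", 1);       // ignored, lower priority
        System.out.println(sketch.resultMap.get("http://example.com/news").source);
    }
}
```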

computeContainerPageModificationDate

protected long computeContainerPageModificationDate(CmsResource containerPage)
                                             throws CmsException
Computes the container page modification date from its referenced contents.

Parameters:
containerPage - the container page
Returns:
the computed modification date
Throws:
CmsException - if something goes wrong
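
One plausible reading of "from its referenced contents" is that the page counts as modified whenever any element it references was modified. The sketch below takes the latest timestamp among the page itself and its referenced contents. That rule is an assumption for illustration, not confirmed behaviour, and plain epoch-millisecond values stand in for the resources.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-in sketch: the latest modification date wins (an assumed rule). */
class ContainerPageDateSketch {

    /**
     * Returns the most recent of the page's own date and the dates of its referenced contents.
     * All values are epoch milliseconds.
     */
    static long computeModificationDate(long pageDate, List<Long> referencedContentDates) {
        long result = pageDate;
        for (long contentDate : referencedContentDates) {
            result = Math.max(result, contentDate);
        }
        return result;
    }

    public static void main(String[] args) {
        long pageDate = 1_000L;
        List<Long> contentDates = Arrays.asList(500L, 3_000L, 2_000L);
        System.out.println(computeModificationDate(pageDate, contentDates));
    }
}
```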

getDetailLink

protected java.lang.String getDetailLink(CmsResource pageRes,
                                         CmsResource detailRes,
                                         java.util.Locale locale)
Gets the detail link for a given container page and detail content.

Parameters:
pageRes - the container page
detailRes - the detail content
locale - the locale for which we want the link
Returns:
the detail page link

getDetailTypesForPage

protected java.util.List<I_CmsResourceType> getDetailTypesForPage(CmsResource resource)
Gets the types for which a given resource is configured as a detail page.

Parameters:
resource - a resource for which we want to find the detail page types
Returns:
the list of resource types for which the given page is configured as a detail page

getDirectPages

protected java.util.List<CmsResource> getDirectPages()
                                              throws CmsException
Gets the list of pages which should be directly added to the XML sitemap.

Returns:
the list of resources which should be directly added to the XML sitemap
Throws:
CmsException - if something goes wrong

getInnerXmlForEntry

protected java.lang.String getInnerXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the inner node content for a url element to a buffer.

Parameters:
entry - the entry for which the content should be written
Returns:
the inner XML

getNavigationPages

protected java.util.List<CmsResource> getNavigationPages()
Gets the list of pages from the navigation which should be directly added to the XML sitemap.

Returns:
the list of pages to add to the XML sitemap

getUrlSetOpenTag

protected java.lang.String getUrlSetOpenTag()
Gets the opening tag for the urlset element (can be overridden to add e.g. more namespaces).

Returns:
the opening tag
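
Overriding this method is how a subclass would declare extra namespaces. The sketch below shows the pattern with stand-in classes rather than the real generator. The default tag shown uses the sitemaps.org 0.9 namespace, which is assumed to be what the base class emits, and the image namespace is the one used by Google's image sitemap extension.

```java
/** Stand-in for the base generator; the default tag shown here is an assumption. */
class UrlSetTagSketch {

    String getUrlSetOpenTag() {
        return "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">";
    }
}

/** Hypothetical subclass that adds the image sitemap namespace to the urlset element. */
class ImageUrlSetTagSketch extends UrlSetTagSketch {

    @Override
    String getUrlSetOpenTag() {
        return "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\""
            + " xmlns:image=\"http://www.google.com/schemas/sitemap-image/1.1\">";
    }

    public static void main(String[] args) {
        UrlSetTagSketch generator = new ImageUrlSetTagSketch();
        System.out.println(generator.getUrlSetOpenTag());
    }
}
```

With the real class, the same override would be placed in a subclass of CmsXmlSitemapGenerator, since the method is protected.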

getXmlForEntry

protected java.lang.String getXmlForEntry(CmsXmlSitemapUrlBean entry)
Writes the XML for a URL entry to a buffer.

Parameters:
entry - the XML sitemap entry bean
Returns:
an XML representation of this bean
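
To show what such an entry looks like under the sitemaps.org protocol, here is a minimal sketch of the inner and outer rendering steps. UrlBean is a hypothetical stand-in for CmsXmlSitemapUrlBean. Its fields, the "value not set" conventions, and the date-only lastmod format are assumptions, and the real method may format its output differently.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Locale;

/** Stand-in sketch of rendering one url element; not the OpenCms implementation. */
class UrlEntryXmlSketch {

    /** Hypothetical replacement for CmsXmlSitemapUrlBean. */
    static final class UrlBean {
        final String url;
        final long lastModified;      // epoch millis; values <= 0 mean "unknown"
        final String changeFrequency; // may be null
        final double priority;        // negative means "not set"

        UrlBean(String url, long lastModified, String changeFrequency, double priority) {
            this.url = url;
            this.lastModified = lastModified;
            this.changeFrequency = changeFrequency;
            this.priority = priority;
        }
    }

    /** Escapes the five characters that the sitemap protocol requires to be entity-escaped. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&apos;");
    }

    /** Builds the child elements of a url element: loc, lastmod, changefreq, priority. */
    static String getInnerXmlForEntry(UrlBean entry) {
        StringBuilder buffer = new StringBuilder();
        buffer.append("<loc>").append(escapeXml(entry.url)).append("</loc>");
        if (entry.lastModified > 0) {
            String date = Instant.ofEpochMilli(entry.lastModified)
                .atZone(ZoneOffset.UTC).toLocalDate().toString(); // yyyy-MM-dd
            buffer.append("<lastmod>").append(date).append("</lastmod>");
        }
        if (entry.changeFrequency != null) {
            buffer.append("<changefreq>").append(entry.changeFrequency).append("</changefreq>");
        }
        if (entry.priority >= 0) {
            buffer.append("<priority>")
                .append(String.format(Locale.ROOT, "%.1f", entry.priority))
                .append("</priority>");
        }
        return buffer.toString();
    }

    /** Wraps the inner XML in the url element. */
    static String getXmlForEntry(UrlBean entry) {
        return "<url>" + getInnerXmlForEntry(entry) + "</url>";
    }

    public static void main(String[] args) {
        UrlBean bean = new UrlBean("http://example.com/?a=1&b=2", 86_400_000L, "weekly", 0.5);
        System.out.println(getXmlForEntry(bean));
    }
}
```

Keeping the inner content in its own method lets a subclass append further child elements without rebuilding the surrounding url element.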

isAliasBelowBaseFolder

protected boolean isAliasBelowBaseFolder(CmsAlias alias)
Checks whether the given alias is below the base folder.

Parameters:
alias - the alias to check
Returns:
true if the alias is below the base folder
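
A check like this is easy to get wrong with a plain prefix comparison, because "/sites/default/english" starts with "/sites/default/en" without being below it. The sketch below compares paths with normalized trailing slashes. It assumes the alias has already been resolved to a root path, and it treats the base folder itself as "below", neither of which is confirmed by this page.

```java
/** Stand-in sketch of a "below base folder" path check on plain strings. */
class AliasPathSketch {

    /** Ensures the path ends with a slash so that prefixes match whole path segments only. */
    static String withTrailingSlash(String path) {
        return path.endsWith("/") ? path : path + "/";
    }

    /** Returns true if the alias root path is the base folder or lies underneath it. */
    static boolean isBelowBaseFolder(String aliasRootPath, String baseFolderRootPath) {
        return withTrailingSlash(aliasRootPath).startsWith(withTrailingSlash(baseFolderRootPath));
    }

    public static void main(String[] args) {
        String base = "/sites/default/en";
        System.out.println(isBelowBaseFolder("/sites/default/en/news", base));
        System.out.println(isBelowBaseFolder("/sites/default/english", base));
    }
}
```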