Endeca component diagram

Posted by Peter Curran

What is Endeca? At first glance, Endeca can seem overwhelming. With so many components and installations, it’s hard to jump right in. Broken down piece by piece, however, it becomes much more approachable. As the Guided Search and Navigation component of Oracle Commerce, Endeca provides a platform for eCommerce web applications. To the business user, Endeca provides a way to configure various aspects of a web application and build web pages on the fly. Users can merchandise different parts of their site by configuring product spotlights, landing pages, relevancy ranking strategies and more. All in all, Endeca is a tool for giving online shoppers a positive experience. Let’s take a look. This essay walks through the accompanying Endeca component diagram.

Data Sources and the ITL – Loading your data

If you’re new to the Endeca world, the first step in setting up your Endeca application is loading your data, in what Endeca calls the Information Transformation Layer (ITL). Data can come from a wide variety of sources. While many applications get their data from text extracts, you can also pull in data from websites, databases, a CMS and more. If you are using Oracle Commerce with ATG and Endeca, Oracle provides an ATG-Endeca integration that makes passing ATG catalog data to Endeca much easier. ATG has an indexing component that outputs records to a CAS Record Store instance. The Content Acquisition System (CAS) is a tool used to pull in data from outside sources through what it calls “crawls”. The data in these record stores can then be configured and indexed by Endeca in multiple ways.

Most raw data (extracts and database sources) is consumed directly by Endeca’s Forge component. The Forge takes its configuration from what Endeca calls a pipeline, which is a collection of XML configuration files that can be modified via Oracle Developer Studio. When building your pipeline in Developer Studio, you can pull in data, join and manipulate it, and configure each data field as either a property (a product-specific field) or a dimension (a field used for facets/refinements).
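To make the property/dimension distinction concrete, here is a minimal Python sketch. It is not Forge or Developer Studio configuration (that lives in the pipeline XML files); the field names and the `DIMENSIONS` set are made up for illustration.

```python
# Toy model only: real property/dimension mappings live in the pipeline XML.
# All field names below are hypothetical.
DIMENSIONS = {"brand", "color"}  # fields shoppers can refine by


def split_record(raw):
    """Split one raw source record into properties and dimension values."""
    properties = {k: v for k, v in raw.items() if k not in DIMENSIONS}
    dimensions = {k: v for k, v in raw.items() if k in DIMENSIONS}
    return properties, dimensions


raw = {"sku": "1001", "name": "Trail Shoe", "brand": "Acme", "color": "Red"}
props, dims = split_record(raw)
print(props)  # product-specific fields
print(dims)   # fields that would drive facets/refinements
```

Here `sku` and `name` end up as properties, while `brand` and `color` become the values a shopper could use to narrow results.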

The Forge also allows you to configure your search relevance ranking strategy.  This includes configuring which fields are indexed for search, as well as configuring Endeca’s out-of-the-box Relevance Ranking Modules.  These modules are meant to help provide more relevant search results to the user and their configuration is usually very specific to the data being pulled into Endeca.
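One way to picture a relevance ranking strategy is as an ordered list of scoring modules, where each later module only breaks ties left by the earlier ones. The Python sketch below works under that assumption. The two scoring functions are simplified stand-ins written for this example, not Endeca's own modules, which are configured rather than hand-coded.

```python
# Simplified stand-ins for ranking modules; not Endeca's implementations.
def exact_match(query, record):
    """Score 1 if the whole product name equals the query, else 0."""
    return 1 if record["name"].lower() == query.lower() else 0


def terms_matched(query, record):
    """Score the number of query terms that appear in the product name."""
    words = record["name"].lower().split()
    return sum(1 for term in query.lower().split() if term in words)


def rank(query, records, modules):
    """Sort by each module's score in order; later modules break ties."""
    return sorted(records, key=lambda r: tuple(-m(query, r) for m in modules))


records = [
    {"name": "Red Trail Shoe"},
    {"name": "Trail Shoe"},
    {"name": "Road Shoe"},
]
ranked = rank("trail shoe", records, [exact_match, terms_matched])
print([r["name"] for r in ranked])
# prints ['Trail Shoe', 'Red Trail Shoe', 'Road Shoe']
```

Reordering the `modules` list changes the outcome, which is why the right strategy depends so heavily on the data being indexed.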

Once your pipeline is configured, you can run a baseline update. This is a script that kicks off the ITL process. It first starts the Forge process, which outputs records and property/dimension configurations to Endeca’s high-speed indexer, the Dgidx. The Dgidx takes those records and configurations and creates an index to be used in your Endeca application for search and navigation. The index and its configuration XML are then picked up by a running instance of the MDEX, called a Dgraph, which is the software that stores the index and configuration. The index in the Dgraph can be viewed in Endeca’s JSP reference application, which we like to call the “Orange App”. For more information on viewing and using the Orange App, look here.
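The order of the stages in a baseline update can be sketched in a few lines of Python. Everything here is a toy stand-in: `forge`, `dgidx` and `ToyDgraph` are hypothetical names that only mimic the roles of the real components, and the "index" is a plain dictionary rather than anything resembling the real index format.

```python
from collections import defaultdict

DIMENSIONS = {"brand", "color"}  # hypothetical dimension fields


def forge(raw_rows):
    """Stand-in for Forge: normalise raw rows into clean records."""
    return [{k: str(v).strip() for k, v in row.items()} for row in raw_rows]


def dgidx(records):
    """Stand-in for Dgidx: map each (dimension, value) pair to record ids."""
    index = defaultdict(set)
    for rec in records:
        for dim in DIMENSIONS:
            if dim in rec:
                index[(dim, rec[dim])].add(rec["sku"])
    return dict(index)


class ToyDgraph:
    """Stand-in for a Dgraph: holds the index and answers navigation queries."""

    def __init__(self, index):
        self.index = index

    def navigate(self, **refinements):
        """Return the skus that match every selected dimension value."""
        hits = [self.index.get(item, set()) for item in refinements.items()]
        return sorted(set.intersection(*hits)) if hits else []


def baseline_update(raw_rows):
    """Run the stages in order: Forge, then Dgidx, then load a Dgraph."""
    return ToyDgraph(dgidx(forge(raw_rows)))


rows = [
    {"sku": "1001", "brand": "Acme ", "color": "Red"},
    {"sku": "1002", "brand": "Acme", "color": "Blue"},
    {"sku": "1003", "brand": "Zenith", "color": "Red"},
]
dgraph = baseline_update(rows)
print(dgraph.navigate(brand="Acme"))               # prints ['1001', '1002']
print(dgraph.navigate(brand="Acme", color="Red"))  # prints ['1001']
```

The point of the sketch is the hand-off: cleaned records flow into the indexer, and only the finished index reaches the component that answers queries.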

The Endeca Workbench and Experience Manager – The fun stuff

Once you have built your index of products, the fun begins. Endeca’s tool for managing an Endeca-driven web application is called the Workbench, and it serves both business users and technical users. From here, you can configure thesaurus entries (synonyms) for your search engine as well as keyword redirects. These tools give you control over the search experience after your index has been built: you can modify your search engine in real time without having to modify the pipeline configuration. These changes are written to what’s called the Endeca Configuration Repository, which is used by the Dgraph when users perform searches. Business owners can also test different search relevancy strategies with the Relevancy Ranking Evaluator, which lets users enter searches and see the resulting products while adding and removing Endeca’s various relevancy ranking modules.
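As a rough illustration of what thesaurus entries and keyword redirects do to an incoming search, consider the Python sketch below. The entries, the URL and the `prepare_search` function are all hypothetical; in a real deployment these rules are managed in the Workbench rather than in application code, and this sketch only models a simple one-way synonym expansion.

```python
# Hypothetical entries; in Workbench these are managed through the UI.
THESAURUS = {"sneaker": {"trainer", "running shoe"}}
REDIRECTS = {"returns": "/help/returns"}


def prepare_search(query):
    """Return ('redirect', url) or ('search', expanded_terms) for a query."""
    q = query.strip().lower()
    if q in REDIRECTS:
        # A keyword redirect sends the shopper to a page instead of results.
        return ("redirect", REDIRECTS[q])
    # A thesaurus entry widens the search to include the synonyms.
    terms = {q} | THESAURUS.get(q, set())
    return ("search", sorted(terms))


print(prepare_search("Sneaker"))
# prints ('search', ['running shoe', 'sneaker', 'trainer'])
print(prepare_search("returns"))
# prints ('redirect', '/help/returns')
```

Because both tables are just data, they can change without rebuilding the index, which mirrors the real-time behavior described above.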

Endeca’s Workbench also provides Experience Manager, an interface for building web application pages and configuring or merchandising various aspects of the web application. It lets you control many parts of your e-commerce application, including landing pages for certain product categories or brands, product spotlights, typeahead, relevance ranking strategies and much more. Endeca splits each component of a page (e.g. guided navigation, results, search box, menus) into cartridges, which are associated with sections of HTML/JSP/.NET code. This makes it very easy to design pages on the fly by adding and removing cartridges in one click. Experience Manager gives the business user total control of what shows up where, all in real time alongside your production web application. And with the push of a button, you can promote your changes from Experience Manager to your live site.

In addition, more technical users can run Endeca scripts from the Workbench, as well as manage the CAS crawls that are part of the ITL process mentioned above.

The Frontend Web Application Layer

Endeca has APIs that make it very easy to integrate Endeca with your frontend web application. Endeca provides the Assembler to handle requests from your frontend application. The Assembler can run “embedded” in the same process as your web application, or as a service in any valid J2EE container. It handles requests for pages built in Experience Manager and fetches the right code and data for the cartridges that you configured for each page. Using the Assembler API (embedded approach) or requests to an Assembler service, your frontend application can obtain data from the Dgraph, which holds your product index. That data is then returned in a Content Item to the frontend application to be used for rendering your JSP or ASP pages.
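The request flow can be pictured with a small Python model: a page is a list of configured cartridges, each cartridge type has a handler that fetches its data, and the result is a nested content item for the frontend to render. The names here (`HANDLERS`, `assemble`, the cartridge types and the in-memory `PRODUCTS` list standing in for the Dgraph) are hypothetical and do not mirror the real Assembler API.

```python
# Toy model: names are hypothetical and do not mirror the Assembler API.
PRODUCTS = [  # stands in for the product index held by the Dgraph
    {"name": "Trail Shoe"},
    {"name": "Road Shoe"},
    {"name": "Rain Jacket"},
]


def search_box_handler(config, request):
    """Fill in the search box cartridge from its page configuration."""
    return {"type": "SearchBox", "placeholder": config.get("placeholder", "Search")}


def results_list_handler(config, request):
    """Fill in the results cartridge with products matching the query."""
    term = request.get("q", "").lower()
    hits = [p for p in PRODUCTS if term in p["name"].lower()]
    return {"type": "ResultsList", "records": hits[: config.get("per_page", 10)]}


HANDLERS = {"SearchBox": search_box_handler, "ResultsList": results_list_handler}


def assemble(page, request):
    """Resolve every cartridge on the page into one response content item."""
    return {
        "type": page["type"],
        "contents": [HANDLERS[c["type"]](c, request) for c in page["cartridges"]],
    }


page = {
    "type": "SearchResultsPage",
    "cartridges": [{"type": "SearchBox"}, {"type": "ResultsList", "per_page": 1}],
}
content_item = assemble(page, {"q": "shoe"})
print(content_item["contents"][1]["records"])  # prints [{'name': 'Trail Shoe'}]
```

Keeping the page definition as data and the handlers as code is what lets a business user rearrange cartridges without a developer touching the rendering logic.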

There are a few more Endeca modules that can be used to enhance your web application. The SEO Module lets you format URLs just how you want them. There is also the Sitemap Generator, which uses the SEO Module to create sitemaps for your frontend application and help search engines better index your site. Endeca also has a logging server, which can generate reports that can be viewed through the Workbench.
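To show the general idea behind readable URLs and sitemaps, here is a short Python sketch. The URL pattern, the `seo_url` and `sitemap` helpers and the `example.com` host are assumptions made for this example; they are not the output format of the SEO Module or the Sitemap Generator. The XML follows the public sitemaps.org schema.

```python
from xml.sax.saxutils import escape


def seo_url(category, refinements):
    """Build a readable path from a category and selected refinements."""
    parts = [category] + [f"{dim}-{val}" for dim, val in sorted(refinements.items())]
    return "/" + "/".join(p.lower().replace(" ", "-") for p in parts)


def sitemap(base, paths):
    """Render a minimal sitemaps.org document listing base + each path."""
    urls = "".join(f"<url><loc>{escape(base + p)}</loc></url>" for p in paths)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"{urls}</urlset>"
    )


path = seo_url("Shoes", {"color": "Red", "brand": "Acme"})
print(path)  # prints /shoes/brand-acme/color-red
print(sitemap("https://www.example.com", [path]))
```

Sorting the refinements keeps the same navigation state from producing several different URLs, which is one of the things search engines penalize.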

Final Thoughts

Although this is only a brief overview of Endeca, hopefully it gives you a better idea of how to get started if you are thinking of integrating Endeca into your e-commerce site, or simply helps you better understand your current Endeca implementation. To learn more about implementing Endeca, check out our presentation from Oracle OpenWorld on Oracle’s Omni-Channel Journey.
