Saturday, July 08, 2006

Search Engine - First Steps
With the web developer community already looking into next-generation online solutions, it becomes imperative that the user experience is seamless and that users are given tools which ensure optimum value for the time they spend online. Search engines within applications have thus become an integral part of every system. Today, I'll be discussing the important factors to keep in mind while developing a search engine for your database. Most of this I have learnt while working with my current organization. There are also things which I do not agree with, which I think will only result in a loss of user base in the long run.
Here are some points which I think (and, of course, have experienced) will help you get started.
Identify the user base - This is important so that you can provide further guidance to the user from the search results page, based on the search conducted. The search engine should be like a restaurant waiter :-) It should ask you everything about your taste (read requirements) and then offer you exactly what you asked for. If the food (read results) does not taste good, you are unlikely to come back.
You should also provide guidance and suggestions so that the user gets to see the best results in no time. These facilities can be divided into two groups.
A. When no or very few results are returned -
  1. Users should be guided, by using clouds, to broaden the search if their search criteria are too specific.
  2. You can also show related results by doing "content mapping" in the background.
  3. Check spellings and suggest correct words.
  4. Other concepts like "stemming" can be incorporated in the algorithm to show more results as per the requirement.
  5. Suggestions - Show suggestions based on your domain intelligence. These should be based purely on the historical search data/logs you have.
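To make points 3 and 4 a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the VOCABULARY list stands in for the terms in your own index or search logs, suggest_spelling uses the standard library's difflib for fuzzy matching, and naive_stem is a crude suffix stripper, not a real stemming algorithm such as Porter's.

```python
import difflib

# Hypothetical vocabulary; in practice this would come from your indexed
# terms or from historical search logs.
VOCABULARY = ["restaurant", "reservation", "recipe", "resort", "database"]


def suggest_spelling(term, vocabulary=VOCABULARY, max_suggestions=3, cutoff=0.75):
    """Return known terms that closely resemble a term the index does not know."""
    term = term.lower()
    if term in vocabulary:
        return []  # already a known term, nothing to suggest
    return difflib.get_close_matches(term, vocabulary, n=max_suggestions, cutoff=cutoff)


def naive_stem(word):
    """Strip a few common English suffixes (a crude stand-in for real stemming)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    print(suggest_spelling("restarant"))
    print(naive_stem("searching"), naive_stem("results"))
```

The cutoff value decides how forgiving the suggestions are; it would have to be tuned against real queries from your own logs.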
B. When too many results are returned -
  1. Clustering - Categorize or classify the result set. This will help users to dig into more relevant results as per their requirement.
  2. Predefined Categories - Show a list of predefined categories.
  3. Search within search - Allow users to search within the results, so that they can drill into more relevant and specific results.
  4. Response time - A fast response to a search is a necessity these days. Users should not be left waiting for results. My recommendation is that the important sections of the search results page should load within 1-2 seconds. Care should be taken that providing the facilities listed above does not affect the response time. The search engine should be divided into segments, and the best-suited technology should be used for each of them. A software developer may want to use different languages for different segments of the engine, considering processing time, ease of change etc.
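Points 1 to 3 above can be sketched roughly as follows. This is only an illustration: the RESULTS records and their field names are made up, and grouping on a stored category field is the simplest form of classification, not a real clustering algorithm.

```python
from collections import defaultdict

# Hypothetical result records; the field names are assumptions for illustration.
RESULTS = [
    {"title": "Java developer, Delhi", "category": "IT"},
    {"title": "Sales manager, Mumbai", "category": "Sales"},
    {"title": "Python developer, Mumbai", "category": "IT"},
]


def group_by_category(results):
    """Bucket results under their predefined category."""
    groups = defaultdict(list)
    for result in results:
        groups[result["category"]].append(result)
    return dict(groups)


def search_within(results, term):
    """Search within search: filter an existing result set by a further term."""
    term = term.lower()
    return [r for r in results if term in r["title"].lower()]


if __name__ == "__main__":
    for category, items in group_by_category(RESULTS).items():
        print(category, len(items))
    print([r["title"] for r in search_within(RESULTS, "mumbai")])
```

Because search_within works on the result set already in hand, it does not need another trip to the database, which helps with the response time concern in point 4.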
Other important factors for a search engine are:
Page layout - The UI team should ensure that the page layout keeps the organic results from being contaminated with advertisements and other paid results. Premium listings, advertisements etc. should be shown separately. The page should also be constructed so that its sections download in order of their importance.
All actionable items on the page should be prominent. It should be analysed whether these actions can be performed on the results page itself through DHTML, CSS etc. AJAX can be of great help here.
Navigation - Research has found that most users do not go beyond page 1-2. Therefore it is imperative that these pages show the most relevant/fresh results as per the requirement. Also, prefetching pages will result in faster navigation across pages.
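Here is one way the prefetching idea could look, as a rough Python sketch. The PrefetchingPager class and the fetch_page callable are hypothetical names of mine; fetch_page stands for whatever query your backend runs. For simplicity the next page is fetched synchronously, whereas a real system would more likely do it in the background.

```python
class PrefetchingPager:
    """Serve result pages from a cache and fetch the next page ahead of time."""

    def __init__(self, fetch_page, page_size=10):
        # fetch_page is an assumed callable: fetch_page(offset, limit) -> list
        self.fetch_page = fetch_page
        self.page_size = page_size
        self.cache = {}

    def _load(self, page):
        # Hit the backend only if this page has not been fetched before.
        if page not in self.cache:
            offset = (page - 1) * self.page_size
            self.cache[page] = self.fetch_page(offset, self.page_size)
        return self.cache[page]

    def get(self, page):
        results = self._load(page)
        if results:
            # Prefetch the following page so the next click is served from cache.
            self._load(page + 1)
        return results


if __name__ == "__main__":
    data = list(range(25))  # stand-in for a ranked result list
    pager = PrefetchingPager(lambda offset, limit: data[offset:offset + limit])
    print(pager.get(1))
    print(sorted(pager.cache))  # page 2 is already cached
```

Since most users stop at page 1-2, prefetching only one page ahead keeps the extra load on the backend small.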
Domain Intelligence - Search engine developers and product managers should continuously monitor search logs and derive important information on user behaviour out of them. Most of the features related to search can be developed by studying the search dump. I would recommend that product managers should not get influenced by popular features on other search applications. They should rather study their own users' behaviour through search dumps and then conceptualize new features, sections etc.
Software developers should be able to fine-tune their algorithm by looking at search dumps and search performance logs.
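As an example of what can be pulled out of a search dump, the sketch below counts the most frequent queries and the queries that returned nothing. The SEARCH_LOG format, a list of (query, number of results) pairs, is an assumption; your own logs will look different.

```python
from collections import Counter

# Hypothetical log entries: (query, number_of_results).
SEARCH_LOG = [
    ("java developer", 120),
    ("jva developer", 0),
    ("sales manager", 45),
    ("java developer", 118),
    ("jva developer", 0),
]


def top_queries(log, n=3):
    """Most frequent queries: candidates for suggestions and predefined categories."""
    counts = Counter(query for query, _ in log)
    return counts.most_common(n)


def zero_result_queries(log):
    """Queries that returned nothing: candidates for spelling fixes or content mapping."""
    return Counter(query for query, hits in log if hits == 0)


if __name__ == "__main__":
    print(top_queries(SEARCH_LOG))
    print(zero_result_queries(SEARCH_LOG))
```

A query that keeps showing up in the zero-result list is a strong hint about what to build next, whether that is a spelling rule or new content.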
Logging - Enhancement/tuning of a search engine is continuous. It therefore becomes all the more important that all aspects of search are logged, so that post-enhancement analysis can be conducted and corresponding action taken.
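A minimal sketch of such logging in Python, writing one JSON record per search. The field names (query, result_count, response_ms, page) are my own assumptions about what is worth keeping; log whatever your later analysis will need.

```python
import json
import time


def make_log_record(query, result_count, response_ms, page=1, timestamp=None):
    """Build one search log record; the field names are illustrative assumptions."""
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "query": query,
        "result_count": result_count,
        "response_ms": response_ms,
        "page": page,
    }


def append_record(path, record):
    """Append the record as one JSON line so the log is easy to parse later."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    record = make_log_record("java developer", 120, response_ms=340)
    print(json.dumps(record))
```

Keeping the response time in every record makes it possible to compare search performance before and after an enhancement.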
Research - It's an important part of any search engine development. Developers should be aware of the latest technologies and findings. Right now Web 2.0 is hot. It should be analysed with respect to your user base, and the best-suited technology/concept should be adopted. The choice of database and language should be made carefully.
Besides the points listed above, there are other concepts like tagging, mashups etc. which may contribute to your search peripherals. Now mark my words here when I say "peripherals". A search results page should not get contaminated with overwhelming hi-tech concepts. It should be simple and seamless.
I keep surfing the net for niche search-related technologies/concepts. A couple of websites which I frequently visit for search-related news are searchenginewatch.com and battellemedia.com. The latter is John Battelle's blog. His book "The Search" is worth a read. He keeps a close watch on developments in search engines, especially Google. I also keep visiting Jeremy's blog as it has some interesting stuff on MySQL.

Sunday, July 02, 2006

Search Engine - Supervised Rankings

Was going through various online journals on search engines when I stumbled upon this document published by Mondosoft. It's interesting, as they argue for the importance of human intervention in search engines. I wish I could attach the entire PDF here, but you may be able to download it from here
This document mainly covers points about gathering user behaviour data, interpreting and analyzing log data, providing informative results etc. It's worth a read. I am trying to get more whitepapers from them.
We have also arrived at similar findings. Our bottleneck lies in implementation. Don't even think that we are technically incapable. It's just that sometimes too much discussion amongst bright people results in a very stringent priority list. The best approach could be to have a clear plan for 6 months and work towards it. That is, however, easier said than done. For us, market feedback and faster turnaround time are crucial. We are always on our toes.
I am planning to write a small journal on search engine implementation soon....