Saturday, December 31, 2011

Copying Click Through Relevancy Data in FAST Search for SharePoint

"FAST Search Server 2010 for SharePoint Click through Extractor Job (FASTQuerySSA)" is a job which runs on daily basis at 1:00 AM which create some files into "%FASTSEARCH%\components\resourcestore\generic\clickthrough" folder.

Basically this job uses the below table(s) from the FAST Query SSA Database  (For Example) "FASTQuerySSA_DB_d28f05ed845e4da69ac45eb72ee7d09" for creating the text file.

  • dbo.MSSQLog
  • dbo.LogClicks
  • dbo.MSSQLogNonClicked
  • dbo.MSSQLogQueryString
  • dbo.MSSQLogURL
SQL Queries which I tried at my end
select queryId, clickedurlId, lcid, clickedUrlRank  from dbo.MSSQLog, dbo.MSSQLogClicks where clickedurlid is not Null and dbo.MSSQLog.queryGuid=dbo.MSSQLogClicks.queryGuid and clickTime >='2011-07-13'

select queryId, nonClickedurlId, lcid,  rank from dbo.MSSQLog, dbo.MSSQLogNonClicked, dbo.MSSQLogClicks  where clickedurlId is not NULL and clickTime >= '2011-07-13'  and clickTime < '2011-07-13' and dbo.MSSQLogClicks.clickId=dbo.MSSQLogNonClicked.clickId and dbo.MSSQLog.queryGuid=dbo.MSSQLogClicks.queryGuid

select queryId, convert(varchar(10), lastReference, 102) as time, queryString from dbo.MSSQLogQueryString where lastReference > '2011-07-13'

select urlId, convert(varchar(10), lastReference, 102) as time, url from dbo.MSSQLogURL where lastReference > '2011-07-13'

Note: As of now I just checked with the sinlge date which is a static date, but some dates on the runtime is provided to any procedure which is executed by the timer job which is mentioned above.
  
Also this job works on the clicks which are clicked in last 30 days excluding the day on which this job is running. So it means that if you would try to execute with "Run now" option, it would not take those search results which are clicked on the same day.

There is another module "SPRel" which runs under FAST and it does the work of boosting the search results URL on the basis of text files which are available under "%FASTSEARCH%\components\resourcestore\generic\clickthrough" folder.

After boosting the documents it removes the text files from the folder with a timer job which it is having.


Recommended Approach

  • Exporting/Importing Click-Through Relevancy Table

As per analysis, the text files can not be used for copying the click through relevancy because these files are being deleted on the daily basis. Also using the files are not the good approach.

So as per my understanding, I was thinking to make the script for these tables which would overwrite the table contents of the target cluster database with overwriting the things.

After that once "FAST Search Server 2010 for SharePoint Click through Extractor Job" would be executed then it would create the "text files" in the "click through" folder and "SPRel" module/job would process the click through relevancy on those URL(S).

As of now I am not very much sure that this is the correct approach of doing the same or not. I am still looking for some more options into this area.