Published: Saturday, December 04, 1999
Searching Through the Text of Each File on a WebSite, Part 4
In Part 3 we examined the source code for the recursive function
GetFiles. Now, we still need to look at the function FormatURL. This
function translates a physical path into a pseudo-URL path. For example, C:\InetPub\wwwroot\scott\test.asp
would be translated into /scott/test.asp.
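Function FormatURL(strPath)
    'Cut off everything before wwwroot and replace all \ with /
    Dim iPos
    'Case-insensitive search (a compare argument of 1 means text comparison)
    iPos = InStr(1,strPath,"wwwroot",1)
    Dim str
    'Skip past "wwwroot" itself, which is 7 characters long
    str = Mid(strPath,iPos+7,Len(strPath))
    FormatURL = Replace(str,"\","/")
End Function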
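One thing FormatURL does not handle is a path that contains no wwwroot at all: InStr then returns 0 and Mid simply chops off the first six characters. The following is a minimal sketch of a guarded variant, not the version used in search.asp. The name SafeFormatURL and the sample paths are assumptions made for illustration.

```vbscript
'Hypothetical guarded variant of FormatURL (SafeFormatURL is an assumed name)
Function SafeFormatURL(strPath)
    Dim iPos
    iPos = InStr(1, strPath, "wwwroot", 1)   'case-insensitive search
    If iPos = 0 Then
        'The path is not under wwwroot, so there is no URL to build
        SafeFormatURL = ""
    Else
        'Keep everything after "wwwroot" (7 characters) and flip the slashes
        SafeFormatURL = Replace(Mid(strPath, iPos + 7), "\", "/")
    End If
End Function

'Sample calls; both paths are made up for this example
Response.Write SafeFormatURL("C:\InetPub\wwwroot\scott\test.asp")   'writes /scott/test.asp
Response.Write SafeFormatURL("D:\data\notes.txt")                   'writes an empty string
```

Returning an empty string is just one possible choice; the caller could also skip such files entirely.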
Finally, we need to display the results of the search. This is accomplished by a single call to the
GetFiles function. Before we call the GetFiles function, we should take a
moment to see if the strLastFile parameter was passed in or not. If it was, we want to
set bolLFFound to False; else, we can just set bolLFFound to
True, since we do not need to first look for a particular file. Anyway, here is the code:
Below are the results of your search in no particular order...
Dim iResults
iResults = 0
'Now, recurse the directories
If Len(strLastFile) = 0 then
    'No last file was passed in, so there is nothing to skip past
    GetFiles objFolder,termsArray,strLastFile,True,bolAnd,iResults
Else
    'Skip files until strLastFile has been found
    GetFiles objFolder,termsArray,strLastFile,False,bolAnd,iResults
End If
Set objFolder = Nothing
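To see why the flag starts out as True or False, it can help to look at the skipping logic in isolation. The sketch below is a simplified stand-in for GetFiles, not the real function: it walks a flat array instead of recursing through folders, and it treats every file as a match. The name NextPage and the file names are assumptions.

```vbscript
'Hypothetical stand-in for the paging logic in GetFiles
Function NextPage(arrFiles, strLastFile, iPageSize)
    Dim bolLFFound, i, iResults, strOut
    bolLFFound = (Len(strLastFile) = 0)   'first page: nothing to skip
    iResults = 0
    strOut = ""
    For i = 0 To UBound(arrFiles)
        If iResults >= iPageSize Then Exit For
        If bolLFFound Then
            'We are past the last visited file, so list this one
            strOut = strOut & arrFiles(i) & "<BR>"
            iResults = iResults + 1
        ElseIf arrFiles(i) = strLastFile Then
            'Found the last visited file; start listing with the next one
            bolLFFound = True
        End If
    Next
    NextPage = strOut
End Function

Dim arrFiles
arrFiles = Array("/a.asp", "/b.asp", "/c.asp", "/d.asp")
Response.Write NextPage(arrFiles, "", 2)         'lists /a.asp and /b.asp
Response.Write NextPage(arrFiles, "/b.asp", 2)   'lists /c.asp and /d.asp
```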
Note that the variable iResults will contain the number of records listed by
GetFiles. If iResults is less than 10, then we know that there are no more
files that match the search terms entered by the user. However, if iResults does indeed
equal 10, then there might be more results. In this case, we'll show the Show more results
link. This link will send the user to search.asp, passing all of the form field variables
through the querystring, including the last visited file. If, on the other hand, iResults
equals 0, then no results were found, and we should display a message to the user. The following code will
accomplish these tasks.
If iResults = 10 then
'Show next page link
%>
<P><HR><P><LI><FONT SIZE=2><B>
<A HREF="search.asp?terms=<%=Server.URLEncode(strKeywords)%>&boolean=<%=Request("boolean")%>&selSearchWhere=<%=Request("selSearchWhere")%>&lf=<%=Server.URLEncode(strLastFile)%>">
Show more results...
</A></B></FONT>
<P>
<% Elseif iResults = 0 then
'No results found %>
<B>No results found!</B><BR>
<FONT SIZE=2><A HREF="/search/">Try another search...</A></FONT>
<% End IF %>
Note that we use Server.URLEncode to ensure that the variables we are passing through the
querystring are properly formatted. If you are unfamiliar with Server.URLEncode, be sure to
read the technical documentation.
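As a rough illustration of the same idea, the link could also be built in one place with every value encoded. This is only a sketch: BuildMoreLink is an assumed helper name that does not appear in search.asp, and strKeywords and strLastFile are assumed to be set as described earlier in the article.

```vbscript
<%
'Hypothetical helper: BuildMoreLink is an assumed name, not part of search.asp
Function BuildMoreLink(strTerms, strBoolean, strWhere, strLast)
    'Encode every value so that spaces, & and = survive the trip
    BuildMoreLink = "search.asp?terms=" & Server.URLEncode(strTerms) & _
                    "&boolean=" & Server.URLEncode(strBoolean) & _
                    "&selSearchWhere=" & Server.URLEncode(strWhere) & _
                    "&lf=" & Server.URLEncode(strLast)
End Function

Dim strMoreURL
'Appending "" turns a missing form field into an empty string
strMoreURL = BuildMoreLink(strKeywords, Request("boolean") & "", _
                           Request("selSearchWhere") & "", strLastFile)
%>
<A HREF="<%=Server.HTMLEncode(strMoreURL)%>">Show more results...</A>
```

Server.URLEncode protects the individual values, while Server.HTMLEncode keeps the & separators from being misread inside the HREF attribute.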
That wraps up the code for the search algorithm! The complete source is available at the end of this article.
Before we wrap things up, though, let's take a moment to analyze this algorithm. Recall from earlier in
the article that this approach is not ultra-efficient. However, to see its efficiency more precisely, a
thorough analysis is needed.
If we choose N to be the number of files that need to be searched through, then our analysis needs to
concentrate on GetFiles, since the remainder of the code will take a
constant time, regardless of the number of files to be searched. Clearly, since we have to iterate
at most N times, the search algorithm runs in linear time asymptotically. What search engine doesn't, though?
In the best case scenario, we will only have to iterate through C files, where C is the number of links we
are showing per paged result. In the worst case scenario, though, we will have to step past N-C
files before reaching the page of results we want. Of course, when stepping past these N-C files, we don't have to open them; we just
pass on by. Still, as N gets large we have to step through a large number of files. Also, if we
choose C too big, we will have to open and close many files.
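To put numbers on this, here is a rough cost model. It assumes that every one of the N files matches the search terms, which is the simplest case and not true in general. The sub name PageCost and the sample figures are assumptions.

```vbscript
'Hypothetical cost model; assumes every one of the N files matches the search terms
Sub PageCost(iN, iC, iPage, iSkipped, iOpened)
    'Files passed over without being opened to reach the requested page
    iSkipped = (iPage - 1) * iC
    If iSkipped > iN Then iSkipped = iN
    'Files opened on this page: at most C, since we stop after C results
    iOpened = iN - iSkipped
    If iOpened > iC Then iOpened = iC
End Sub

Dim iSkipped, iOpened   'filled in by PageCost (VBScript passes arguments ByRef by default)
PageCost 1000, 10, 1, iSkipped, iOpened      'first page: 0 skipped, 10 opened
PageCost 1000, 10, 100, iSkipped, iOpened    'last page: 990 skipped (N-C), 10 opened
```

When few files match, more than C files have to be opened to fill a page, so the real figures will sit somewhere between these two extremes.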
This is a simple analysis. I leave it to you to extend and apply it!
For a more thorough discussion on the efficiency of this and what other options apply,
be sure to check out
this messageboard post.
Happy Programming!
Read Part 3
Read Part 2
Read Part 1
Attachments
The HTML interface to the search engine, in text format
The source code for search.asp, in text format