
May 15th, 2004
02:51 AM
i play bass wee.
Outputting flat file search results
OK, I am pretty sure this is possible, but I am very confused about how to go about it. I want to create a script that searches a flat file for a keyword and, if that keyword appears within a line, displays that line as a search result. It would have to work for multiple matching lines.
<?php
//songs.txt
The Who - My Wife
Radiohead - Like Spinning Plates
Radiohead - Just
The Beatles - I Want You (Shes so Heavy)
?>
So if the user searched for 'radiohead' it would output:
Radiohead - Like Spinning Plates
Radiohead - Just
The search would have to be case insensitive as well.
All I have so far is this:
<?
$fp = fopen('songs.txt','r');
$ul = fread($fp,filesize('songs.txt'));
fclose($fp);
// break up the file into lines
//
$bits = explode("\n",$ul);
// look for our artist / song
if (in_array($song,$bits)) {
echo $song . ' exists!'; //this is where it would output the lines where the song is found
} else {
echo $song . ' not found.';
}
?>
Thanks in advance guys!

May 15th, 2004
03:19 AM
I assume the file is not so large so you could theoretically do something like:
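<?php
$search = strtolower( trim( $search ) ); // normalize the search term once
$search_soundex = soundex( $search );
$found = array(); // line numbers of matching lines
$fp = file( 'songs.txt' ); // array of lines, trailing newlines included
foreach( $fp as $lineNumber => $text ){
    // trim the line so the last word does not carry the newline
    $words = explode( " ", trim( $text ) );
    foreach( $words as $iWord ) {
        // exact (case-insensitive) word match, or a word that sounds alike
        if( $search == strtolower( $iWord ) || soundex( $iWord ) == $search_soundex )
            $found[] = $lineNumber;
    }
}
?>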
Not the best and really untested, but something like that should work. Note: I have never used the function soundex() before in an actual script, but I have played around with it and it's really cool. You can also substitute similar_text() or whatever for soundex().
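If only the plain case-insensitive substring match from the question is needed, without the soundex fuzziness, a minimal sketch could look like the one below. The helper name search_lines() is made up for illustration, the sample array stands in for file('songs.txt'), and stripos() assumes PHP 5 or later.

```php
<?php
// Hypothetical helper: return every line that contains $keyword, ignoring case.
function search_lines(array $lines, $keyword)
{
    $keyword = trim($keyword);
    $results = array();
    if ($keyword === '') {
        return $results; // an empty keyword matches nothing
    }
    foreach ($lines as $line) {
        // stripos() returns the match offset or false; offset 0 is a valid hit
        if (stripos($line, $keyword) !== false) {
            $results[] = rtrim($line, "\r\n");
        }
    }
    return $results;
}

// Sample data mirroring songs.txt; with a real file, use file('songs.txt') instead.
$songs = array(
    "The Who - My Wife\n",
    "Radiohead - Like Spinning Plates\n",
    "Radiohead - Just\n",
    "The Beatles - I Want You (Shes so Heavy)\n",
);

foreach (search_lines($songs, 'radiohead') as $hit) {
    echo $hit, "<br/>\n";
}
```

Comparing against false with !== matters here, because a keyword found at the very start of a line gives offset 0.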


May 15th, 2004
03:23 AM
i play bass wee.
Wow, thanks for the quick response. Can you explain a little what each section does, because I am totally lost? I can figure some stuff out, but a few general explanations would really help... thanks again.

May 15th, 2004
03:26 AM
i play bass wee.
I think I have figured a few things out
<?
$search_soundex = soundex( $search ); //$search is the keyword being searched for
$fp = file( 'songs.txt' ); //songs file
foreach( $fp as $lineNumber => $text ){ //explode the file into line numbers and store in $text variable
$words = explode( " ", $text ); //separate into each word
foreach( $words as $iWord ) {//this is where I get lost
if( trim( strtolower( $search ) ) == strtolower( $iWord ) || soundex( $iWord ) == $search_soundex )
$found[] = $lineNumber;
}
}
?>

May 15th, 2004
03:37 AM
$search is the search phrase.
file() reads an entire file into an array.
foreach() goes through an array element by element. $lineNumber is the key (index) of the element currently being read, and $text is that element's value, i.e. the line itself.
The inner foreach goes through each word on that line.
If a word matches the search phrase, or the word's soundex matches the search phrase's soundex, the line number is put into the "found" array.
You should then run an array_unique on the results to make sure there aren't duplicate line numbers.
You can view the results by doing something like:
<?php
foreach( $found as $line ) {
echo $fp[$line]."<br/>";
}
?>
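The array_unique() step mentioned above could be sketched like this. The sample arrays are assumptions standing in for file('songs.txt') and for the $found array that the search loop builds.

```php
<?php
// Sample data standing in for file('songs.txt') and the search loop's $found array.
$fp = array(
    "The Who - My Wife\n",
    "Radiohead - Like Spinning Plates\n",
    "Radiohead - Just\n",
);
$found = array(1, 1, 2); // line 1 was recorded twice, e.g. two words on it matched

// array_unique() keeps the first occurrence of each value and preserves keys;
// array_values() then renumbers the result from 0.
$unique = array_values(array_unique($found));

foreach ($unique as $line) {
    echo rtrim($fp[$line]) . "<br/>\n";
}
```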


May 15th, 2004
03:47 AM
i play bass wee.
Thanks, it's working just great.

May 15th, 2004
03:50 AM
It's working!! LMAO
I really didn't expect that. Try typing in some things that are _almost_ the same as your search phrase and see if it works. Like "radohead" instead of "radiohead". Tell me if it actually works.
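A quick way to check the typo case without touching the songs file is to compare the soundex keys directly. This is only a sketch; the three sample words are arbitrary picks for illustration.

```php
<?php
// soundex() returns a four-character key; words that sound alike tend to share it
$typo    = soundex('radohead');
$correct = soundex('radiohead');
$other   = soundex('beatles');

echo $typo === $correct ? "typo matches\n" : "typo does not match\n";
echo $other === $correct ? "beatles matches\n" : "beatles does not match\n";
```

Keep in mind that soundex always keeps the first letter, so a typo in the first character (for example "tadiohead") would not be caught this way.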


May 15th, 2004
04:04 AM
i play bass wee.
Yeah, that soundex() function is the ****. This is working great! Thanks for the help.

May 15th, 2004
04:33 AM
with Mr. Jones
I was messing with this... it's reasonably accurate, but mess with the percents up top to change the boundaries. It's a bit faster than the other one...
<?php
define('FILE', 'file.txt');
define('SEARCH_TERM', 'ludo');
define('EXACT', false);
define('MATCH_PERCENT_SHORT', 15);
define('MATCH_PERCENT_LONG', 35);
$lines = file(FILE);
$short_search = explode('-', SEARCH_TERM);
$search['short'] = $short_search[0];
$search['long'] = str_replace('-', '', SEARCH_TERM);
foreach($lines as $number=>$content)
{
    // similar_text() fills its third argument with the similarity as a percentage
    similar_text($content, $search['short'], $short_percent);
    similar_text($content, $search['long'], $long_percent);
    // a substring hit always matches; unless EXACT is set, fall back to the fuzzy thresholds
    $match = EXACT
        ? (stristr($content, SEARCH_TERM) !== false)
        : ((stristr($content, SEARCH_TERM) !== false)
            || ($short_percent > MATCH_PERCENT_SHORT)
            || (($long_percent > MATCH_PERCENT_LONG) && ($short_percent > MATCH_PERCENT_SHORT-10)));
    if($match)
    {
        $matches[] = $content;
    }
}
if(isset($matches))
{
print count($matches) . ' matches.<br />';
foreach($matches as $match_number=>$item)
{
print ($match_number+1) . ': ' . $item . '<br />';
}
}
else
{
print 'no matches';
}
?>
___________________
http://www.philbrodeur.com - Expert PHP Development and Tutorials

May 15th, 2004
04:42 AM
Nice script, Phil. But perhaps you should use levenshtein() which is faster and also supports a greater complexity. soundex() is the fastest of the bunch but it performs a different task than similar_text() or levenshtein().
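A rough sketch of what a levenshtein()-based check might look like is below. The helper name fuzzy_line_match() and the default threshold of 2 edits are assumptions for illustration, not something tuned against real data.

```php
<?php
// Hypothetical helper: true if any word in $line is within $maxDistance edits of $term.
function fuzzy_line_match($line, $term, $maxDistance = 2)
{
    $term = strtolower(trim($term));
    // split on whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces
    $words = preg_split('/\s+/', strtolower(trim($line)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        // levenshtein() counts the insertions, deletions and replacements needed
        if (levenshtein($word, $term) <= $maxDistance) {
            return true;
        }
    }
    return false;
}

// Sample data mirroring songs.txt
$songs = array(
    "The Who - My Wife",
    "Radiohead - Like Spinning Plates",
    "Radiohead - Just",
);

foreach ($songs as $song) {
    if (fuzzy_line_match($song, 'radohead')) {
        echo $song, "<br/>\n";
    }
}
```

A fixed threshold is the simplest choice; for short search terms it may be too generous, so scaling it with the term length could be worth trying.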
