Wednesday 2 June 2010

Unix For Dummies

Can you help me with a "UNIX for Dummies" type question regarding a RegExp query?

All I want to do is search for an EXACT PHRASE in zipped files in a UNIX database, and have the query return all instances of the exact phrase. (UNIX instruction manuals are too confusing!)

So, if I wanted to search a bunch of zipped files in a UNIX database for files containing the phrase
"the more details you provide", case insensitive - what exactly would I enter? I've used the grep command before, but don't really know what I'm doing. I'd really appreciate your help!

Thanks, Nicey and Publicident! If the files to search were UNzipped, what would the syntax be?

(It's a long, complicated story as to why I haven't just bitten the bullet and asked a coworker.)

And yes, Mypublicident, I'll check out that link you suggested.

Any other suggestions for UNIX guides for the "mentally challenged" would be great. Thanks!


What you are asking for, regex searching in blobs, is tough enough on its own, and the fact that the files in the blobs are zipped really throws a wrench into the mix.

Ideally, you would index the files before they are zipped. Then, through an OLAP cube, you could pull out the data that you need. If this isn't feasible, then the solution depends on your database. For example, in Oracle, you can use Java to extract the files, unzip them to memory, then run regex queries on each file. Expect your queries to be mighty slow and processor intensive.

In the worst case, you select a single blob at a time, unzip it, run grep -e or another regex utility on it, store the results, then move on to the next file. Something like this will take quite a while to run.
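
That one-blob-at-a-time loop could look roughly like the Python sketch below. It is only an illustration under some assumptions: the blobs are gzip-compressed UTF-8 text (not .zip archives), they have already been fetched from the database as (name, bytes) pairs, and the function name search_blobs is made up for this example. The phrase is escaped so it is matched literally, and case is ignored.

```python
import gzip
import re


def search_blobs(blobs, phrase):
    """Return (name, line_number, line) for each line containing phrase.

    blobs is an iterable of (name, gzip_compressed_bytes) pairs; how they
    are read from the database is left out (assumption for this sketch).
    """
    # re.escape makes the phrase match literally, not as a regex;
    # re.IGNORECASE makes the match case insensitive.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    results = []
    for name, compressed in blobs:
        # Unzip one blob at a time, search it, store the hits, move on.
        text = gzip.decompress(compressed).decode("utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                results.append((name, number, line))
    return results


if __name__ == "__main__":
    # Hypothetical sample data standing in for rows pulled from the database.
    sample = [
        ("a.gz", gzip.compress(b"Hello\nThe More Details You Provide, the better\n")),
        ("b.gz", gzip.compress(b"nothing here\n")),
    ]
    for hit in search_blobs(sample, "the more details you provide"):
        print(hit)
```

Every blob still gets decompressed in full, so this stays slow on a large table; it only shows the shape of the loop.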

My suggestion is to work out a way to index the files based on what you expect the regular expressions to be. Then run your queries against the indexed data for quick results.
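
As a rough idea of what such an index might be, here is a minimal word-level inverted index in Python. This is a sketch, not a data warehouse design: it assumes you still have the plain text of each file before it is zipped, and the names build_index and candidates are invented for the example. The index only narrows the search down to files that contain every word of the phrase; those candidates would still need an exact phrase check afterwards.

```python
import re
from collections import defaultdict

# Words are runs of lowercase letters, digits, or apostrophes (assumption).
WORD = re.compile(r"[a-z0-9']+")


def build_index(documents):
    """Map each lowercase word to the set of file names containing it.

    documents is a dict of {file_name: plain_text}, built before zipping.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in WORD.findall(text.lower()):
            index[word].add(name)
    return index


def candidates(index, phrase):
    """Return the names of files that contain every word of the phrase."""
    words = WORD.findall(phrase.lower())
    if not words:
        return set()
    # Intersect the per-word file sets; a missing word yields an empty set.
    return set.intersection(*(index.get(word, set()) for word in words))


if __name__ == "__main__":
    # Hypothetical documents used only to exercise the functions.
    docs = {
        "a.txt": "The more details you provide, the better.",
        "b.txt": "Provide more details.",
        "c.txt": "unrelated",
    }
    print(candidates(build_index(docs), "the more details you provide"))
```

Only the files returned by candidates need to be unzipped and searched, which is where the speedup would come from.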

I'm going to hate myself for saying this, but check out Kimball's book on data warehousing. It has a bit more info than you'll need, but it'll provide a decent start for you.


