BLADAM 2.0[?]: Life, Liberty, Love and Stuff
DISCLAIMER: This is my personal blog. The blatherings here aren't (necessarily) the views of the current company I work for, companies I've previously blessed with my presence, my loving parents, the Illuminati, or anyone other than me, me, me!

I just figured out the search engines’ next foray!


I know, I know, some of you are probably sick of me idly speculating on what Google, Yahoo!, and Microsoft are going to do next, but I just had yet another vision that I wanted to share with you.

One of the search engines is going to build or buy a leading OCR and/or photo scanning software package. Why?

Well, just follow the plotline in your head. Google just built a system (Google Base) which, if perhaps rather inelegantly, lets people add content in bulk for that search engine to slurp up.

Google and (separately) the Open Content Alliance are busy scanning the world's books.

So we have Web pages, music, images, scholarly research, books, and more being indexed... but what about all those zillions of papers folks have lying around? Like the ones I just set about scanning this evening to reduce some of the clutter around my desk.

What have I been scanning? A list of waltz moves, an e-mail directory, a memorable schedule of a recent dance camp I attended, and a funny article I wrote for my high school newspaper.

How much of this would the world be interested in? How much would I really WANT to share? Not all of it, to be sure.

But from older academic papers to newspaper clippings to home photos and more... there's a TON of information out there that's not digitized.

Not digitized yet, that is.

And interestingly enough, decent scanners (albeit not slide scanners) are pretty darn cheap ($50 or less, especially used ones on eBay). But really good OCR software? At least $150, from what I've gathered. Students, families, home-office professionals... I bet most of them have scanners. But I doubt most of them have OCR software.

Then again, perhaps the search engines could simply piggyback onto non-OCR scanning software and do the OCR on their supercomputers in-house. That would give them greater ability to iterate, do A/B testing on scan quality, etc., without depending upon users to update software.
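For the geekier readers, here's a rough sketch of how the server-side testing idea might look. It's purely my own illustration, not anything a search engine has published: the variant names are made up, and hashing the scan's bytes to pick a bucket is just one simple way to split uploads between two OCR pipelines without touching the user's software.

```python
import hashlib

# Hypothetical names for two server-side OCR pipeline variants (assumptions).
VARIANTS = ("ocr_pipeline_a", "ocr_pipeline_b")


def assign_variant(scan_bytes: bytes, variants=VARIANTS) -> str:
    """Deterministically pick a variant from the scan's content hash.

    The same scan always lands in the same bucket, so results stay
    comparable across re-uploads.
    """
    digest = hashlib.sha256(scan_bytes).digest()
    # Use the first 8 bytes of the hash as an integer, then bucket it.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


def split_counts(scans):
    """Count how many scans fall into each variant."""
    counts = {variant: 0 for variant in VARIANTS}
    for scan in scans:
        counts[assign_variant(scan)] += 1
    return counts
```

Because the assignment lives entirely on the server, swapping in a third variant or retiring one would only mean changing `VARIANTS`; nothing would need to be pushed to anyone's desktop.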

* * *

Benefit to engines:
  • A huge database to improve NLP (natural language processing) algorithms... better understanding the interplay of text, graphs, photos, etc.
  • Access to a ton of new content
  • Further enticement to consumers to get onto their desktops (e.g., perhaps bundled in with Google Desktop or MSN Search or Yahoo-X1 search, etc.)


Benefit to consumers:
  • Ability to archive documents and/or photos online with greater accuracy, and for less money (even free) for personal retrieval.
  • Easier way to share not-yet-digitized documents with colleagues, using an OCR'd (much less bandwidth intensive) format
  • Probably other stuff I'm overlooking


* * *

What are your thoughts on this?

1) How feasible do you think it is that one of the search engines will buy/build such a service?
2) Which search engine'd do this first?
3) How useful would it actually be to general consumers? Small business folks? Others?
 

- Blathered by Adam on Sunday, November 27, 2005 at 23:21 [ Permalink | Trackback ]
- Filed under Geekery
- Commented on by 3 folks so far. Scroll down and see for yourself (and join in the conversation!)


*cough* Riya buyout rumors *cough*

- Posted on Monday, November 28, 2005 at 2:53 [ Permalink to this comment ]

Heh heh, J, well, one of the head advisors of Riya specifically denied that Google was buyin’ them.  But admittedly that leaves Microsoft and Yahoo.

Or......... for the wacky-but-could-happen-idea… how ‘bout AOL buying Riya ;-)

- Posted on Monday, November 28, 2005 at 2:57 [ Permalink to this comment ]

I feel sorry for the people who have to scan all those :-)

- Posted on Wednesday, November 30, 2005 at 23:38 [ Permalink to this comment ]

