When You are THE MAN

May 22nd, 2009

Our friends have asked us if we were nervous about going out on our own, especially with the economy as it is. Alex and I have wanted to do our own thing for a while. We call this yearning “not wanting to work for The Man.” We, to a large degree, control our own destiny. That means we need to simultaneously market our company, raise capital, develop the product and run an office. It’s only been two weeks and we’ve learned that when you are The Man, you have to wear many hats.

For most people who work in established companies, these tasks are divided and parceled out amongst numerous specialists. One of the chief complaints I heard from the younger engineers at our old company was that they didn’t have a good view of the overall business. What I think they meant was that they didn’t see the interaction of all the components that make a company run. I was an applications engineer: I met customers, helped solve their problems (sometimes at their location), defined products and characterized them in end applications. I thought I had a good view of how the business ran. Boy, was I wrong.

There are a lot of “small” organizational tasks that I took for granted. For example, we had a closet full of pencils and notebooks. We had an amazing field engineering team. We had a healthcare plan. I learned that when you are The Man, you need to know when to take advantage of the sale on a 24-port 10/100 Mbps switch. You need to implement a successful distribution system and understand how to deal with healthcare for you and your employees. By the way, Microcenter has a great deal on a 24-port D-Link switch.

I see PR in a whole new light. At our previous large company, I authored a fair number of webinars and articles. I contributed to ad campaigns for product releases. But at a startup, there is this visceral sense of the direct linkage with the bottom line. It’s where the rubber meets the road. Our web page ranking and the press we get will clearly affect how Baydin takes off. We’re working on a kick-ass product but people need to know about it to buy it.

To our friends who have been writing in to ask us how things are going: the protective cocoon has come off and we’re experiencing the internal clockwork of business. We’re beginning to see things as they are.

stever Small Talk

Desktop Search: What hasn’t changed

May 20th, 2009

A few posts back, we talked about why search on the desktop works a lot better than it did just a few years ago.  In this post, we’ll talk about how desktop search hasn’t kept up as the way we find and consume content on our computers has changed.

As recently as 2000, the deluge of emails, files, podcasts, blog posts and everything else that we have to keep track of was more like a drizzle.  The average hard drive held about 8 GB of data and we averaged about 7 non-junk emails per day.

As of 2009, those numbers look pretty different.  My laptop’s hard drive is a relatively tiny 160 GB; most computers come with at least 320 GB.  The way we work with email has changed too.  We now average 25 emails per day (a whopping 9,000-plus per year!) thanks to a lot of mailing lists and a lot of CCing.

Of course, we’re not suddenly 60 times more productive than we used to be.  Instead, we just get more of other people’s content.  Before Gmail made email quotas obsolete, CCing large files to everyone who might want a document wasn’t practical.  In 2000, blogs didn’t really exist, and the number of pages that interested each of us on the Internet was orders of magnitude smaller.

The problem only intensifies if we think about it from a corporate perspective.  How many gigabytes of data does your entire company have?  Where does it live?  At our former company, many groups had internal wikis, and all of them had internal SharePoint sites (at least three, and as many as fifteen per group!).  We had a document management library, and we had personal websites with documents attached.  Everyone cared more about getting the job done than about making it easy for other people to find what they created.

So there are now a lot more fragments of information in our brains and a lot more places that the rest of that information could be.  We spend a lot more time asking ourselves “where did I see that again?”  That translates into a lot of time and money. Bill Gates says that the average knowledge worker spends 11 hours a week looking for information, costing his/her company $18,000 per year in lost time.

The future looks like it is going to be even more chaotic – we will not only access more information in more places, but on more devices as well.  We will see some content on our computers, some on our $200 Netbooks, more on our iPhones or BlackBerries, and even more on our Kindles or Sony Readers.  And as we see more content on more devices, remembering where we saw the content we need NOW is going to get even harder.

A lot of productivity gurus are challenging us to “take charge of our Inboxes!” and implement a regimen that will help us manage the information.  But technology caused this problem.  Why isn’t it fixing it?

Fundamentally, the way we look for information hasn’t changed a lick since 2000.  Whether searching our computers or the Internet, we try to figure out what we want to find and we type it into a search box.  We get results that we hope are good enough – they often are.  When programmers have tried to improve on the search box, they’ve come up with some terrifying things.

I’ve attached a screenshot of the MIT Simile Seek project’s implementation of what is called faceted search below.  It’s a programmer’s dream.  I think I am wired to love driving tools like this.  It feels like piloting a starship.  If I know I want the second-level domain to be .mit.edu because I know it came from someone at MIT, but I don’t know which lab, faceted search puts that power right at my fingertips.

[Screenshot: simile_seek]
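For readers who have not used faceted search, here is a rough sketch of the idea in Python. It is not how Seek is implemented. The records and the facet names (`domain`, `lab`) are made up for illustration. The point is only that each choice narrows the result set, and the remaining values of the other facets are counted so you can see where to click next.

```python
from collections import Counter

def narrow(items, **selected):
    """Keep only the items whose facets match every selected value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]

def facet_counts(items, facet):
    """Count how many of the remaining items fall under each value of a facet."""
    return Counter(item[facet] for item in items if facet in item)

# Hypothetical records; a real tool would pull these from mail or files.
documents = [
    {"title": "Project notes", "domain": "mit.edu", "lab": "lab-a"},
    {"title": "Seminar slides", "domain": "mit.edu", "lab": "lab-b"},
    {"title": "Vendor quote", "domain": "example.com", "lab": "none"},
]

# "I know it came from MIT, but I don't know which lab."
from_mit = narrow(documents, domain="mit.edu")
print(facet_counts(from_mit, "lab"))
```

Picking one of the listed labs is just another call to `narrow` with one more facet selected.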

But when I showed faceted search to people who don’t program computers for a living (like electrical engineers), they did not share my enthusiasm.  Other search improvements yielded similar gnashing of teeth.  The search box remains the search box.

So we’ve got a lot more content than we’ve ever had before, located in a lot more places than it’s ever been before, and we access it on more devices than we’ve ever used before.  And we still do pretty much the same things to find it that we did in 2000, when we had a lot less content, all on one hard drive, all on one computer.

So there’s a lot to fix.  And we’d love to fix all of it!  But for now, we’re trying to siphon off just one aspect of the problem where we think our technology can make a big difference.  In a few days, we’ll talk more about how we’re going to do it.

Alex Moore Technical

How do you feel about… telepathy?

May 15th, 2009

Whenever I ask friends what superpower they would most want, at least one person says “I want to read minds.” This can be a double-edged sword. Baydin will get you more than half-way there. We can’t tell you how just yet but I’m curious to know what people think about being so empowered. There’s also a Baydin page on Facebook if you want to join in on the discussion there.

stever Small Talk

How do you transfer knowledge and experience?

May 7th, 2009

As we start Baydin, we wrap up work at our old jobs. I’ve spent the last three weeks trying to “transfer” everything I’ve learned in the greater part of a decade to my colleagues. It’s tough. 

Knowledge that’s instilled or discovered over time is conveyed quickly without context. I copied my entire hard disk and passed the external drive around so that everyone could store the information. I gave a few presentations and some explanations of typical problems and what to look out for. 

Although I work with some of the smartest, most talented people around, they do not have the full context of each of these projects. I mean, how could they? I worked on them for years. They have had three weeks to ramp up before I leave.

What we’re working on at Baydin will have huge ramifications on getting up to speed and transferring knowledge. Our software will give people the necessary context to most effectively understand and learn. In addition, it will show only what is relevant. No more. No less. 

I see examples of the need for Baydin every day. It’s going to be exciting. Stay tuned!

stever Small Talk

A Brief History of Search on the Desktop

May 6th, 2009

Desktop search has come a long way in the past few years.  In this post, we’ll explore how the technology behind all of the major desktop search options has changed based on web search innovations.  In the follow-up posts, we’ll talk a little bit about how desktop search is different from web search and how it has both succeeded and failed at making interacting with our computers better.  We’ll share a few tricks for getting more out of Desktop Search and a few things we wish it could do.  We’ll also share a little bit about how Baydin plans to fill in the gaps.

There are two major advantages to a modern desktop search experience: the first is that searching for a document is a lot faster than it used to be, and the second is that in virtually all file types, the text inside the document is searchable, instead of just the filename.

Think back to the file search in Windows 95.  It was pretty terrible.  All it could do was search for filenames, and it took the better part of eternity to find anything.  Here’s why: when someone searched for a word, Windows opened the file system and looked at every single file it had.  It compared the search query with the filename for each file, and as it found matches, it added the files to the results listing.  Every time a new search started, Windows had to look at every single file, which is why the results trickled in over a period of a few minutes.  If the search term was somewhere in a document or in an email rather than in the filename of a physical file, we were pretty much out of luck.
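To make that concrete, here is a small Python sketch of the same brute-force approach. It is an illustration of the technique, not Windows 95’s actual code, and the function name `filename_search` is made up for this post.

```python
import os

def filename_search(root, query):
    """Yield paths under root whose filename contains query (case-insensitive)."""
    needle = query.lower()
    # Nothing is remembered between searches: every call walks the whole tree again.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only the filename is compared; the file's contents are never opened.
            if needle in name.lower():
                # Matches are handed back one at a time, as they are found.
                yield os.path.join(dirpath, name)
```

Calling `list(filename_search(some_folder, "chicken"))` touches every file under `some_folder`, so the cost grows with the number of files on the disk, no matter how few of them match.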

[Screenshot: win95search]

Searching the full text of documents was out of the question.  To do that, Windows would need to open every single file as it came across it and extract the text.  It would have been slower than slow, it would have required every piece of software that saved any kind of document to provide hooks for Windows to extract the text, and it probably would have made the computer rottenly unstable.

Searching through email in Office (up until 2003) used the same method, but since every email had a known structure, Outlook could search through the full text of messages.  When a user started searching for something, Outlook opened the most recent email and compared the search terms against each word in that email.  If there was a match, it would add the email to the result list in real-time.  When it finished with the most recent email, it would move on to the next, then to the next, then to the next.  Searching through email was a slow process, but it would eventually yield results where the terms were found only in the text of emails.
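Here is a rough Python sketch of that newest-first scan. The message fields (`received`, `subject`, `body`) are hypothetical, and requiring every search term to appear is an assumption made for the example. Outlook’s real matching rules are not described here.

```python
import re

def scan_messages(messages, query):
    """Yield messages that contain every query term, newest first."""
    terms = re.findall(r"\w+", query.lower())
    # Start with the most recent email and work backwards through the mailbox.
    for message in sorted(messages, key=lambda m: m["received"], reverse=True):
        text = (message["subject"] + " " + message["body"]).lower()
        words = set(re.findall(r"\w+", text))
        # Compare the search terms against the words in this one email.
        if all(term in words for term in terms):
            yield message  # added to the result list as soon as it is found
```

Every search rereads every message, which is why this style of search slows down as the mailbox grows.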

A real innovation happened, though, when software developers realized that the same technology that powers web search engines could be applied to the desktop.

When someone clicks the search button on a web search engine, the search engine responds in a totally different way from Windows 95-style search.  Google does not crawl every page on the web, word for word, comparing the search terms for a match.  Instead, Google just looks in a previously-generated database where it has already prepared a list of all the web pages that contain the search term (and a bunch of other information that helps it order the results!).

Instead of sifting through every word ever written on the Internet in real time, Google crawls each page on the web only every few hours, days, or weeks depending on how important a site is and how frequently its content changes.  When Google crawls a site, its crawler looks through every page, processes every term, and updates the database. 

Very crudely, that index looks like this:

Term      Results
baydin    http://www.baydin.com
          http://burmadigest.info/2008/03/20/set-ka-lay-baydin-burmese
          http://www.baydin.com/blog
          etc.
chicken   http://en.wikipedia.org/wiki/Chicken
          http://allrecipes.com/Recipes/Chicken
          etc.
outlook   http://www.microsoft.com/outlook
          http://en.wikipedia.org/wiki/Microsoft_Outlook
          etc…

All Google has to do when you search for “chicken” is find the “chicken” entry in that index and list the results.
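In code, a crude version of that index is just a dictionary from terms to sets of pages. The sketch below is a toy, not Google’s implementation: the page text is invented for the example, and the names `build_index` and `search` are made up.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    # This is the slow part, done ahead of time (the "crawl").
    for url, text in pages.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(url)
    return index

def search(index, term):
    """Look up one term. No page text is read at query time."""
    return sorted(index.get(term.lower(), set()))

# Invented page text, for illustration only.
pages = {
    "http://www.baydin.com": "Baydin makes Outlook more helpful",
    "http://en.wikipedia.org/wiki/Chicken": "The chicken is a domesticated bird",
    "http://allrecipes.com/Recipes/Chicken": "Chicken recipes for every night",
}

index = build_index(pages)
print(search(index, "chicken"))
```

The work of reading every page happens once, in `build_index`. Each search after that is a single dictionary lookup.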

Of course, that’s a sweeping simplification – it doesn’t address multiple-term searches, result order, or the fact that the index is HUGE and difficult to maintain.  There are dozens of fantastic papers from Google engineers that explain a lot of the details; try http://labs.google.com/papers for a listing, or start with the overview from when Sergey and Larry were still at Stanford.  But for the purposes of this post, that’s all we need to worry about.
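Multiple-term searches are one of the gaps mentioned above. One common way to handle them is to intersect the per-term lists. That approach is assumed here for illustration and is not taken from Google’s papers. The tiny index and the function name `search_all` are made up.

```python
def search_all(index, query):
    """Return the documents that contain every term in the query."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    # A document qualifies only if it appears in every term's list.
    return set.intersection(*postings)

# Hypothetical index: term -> set of documents containing it.
index = {
    "chicken": {"recipes.html", "florentine.doc"},
    "florentine": {"florentine.doc"},
    "outlook": {"problems.doc"},
}

print(search_all(index, "chicken florentine"))
```

Result order is a separate problem. This sketch returns an unordered set and makes no attempt at ranking.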

Creating and maintaining a mapping from search terms to web pages is the critical innovation for desktop search.  The idea extends quite well to our individual computers.  Instead of a mapping from terms to web pages, though, we need to make a mapping from terms to documents. So the problem is a little bit harder in that we have to be able to index a whale of a lot of document types instead of just HTML, but it is a lot easier in that the index size is nowhere near as large as the index for the web.  It can be generated relatively fast (probably under an hour for the average computer) and does not require a lot of space.

Google Desktop Search, Windows Desktop Search, and all the competitors do exactly this.  Their indexers run in the background, open every file on the computer, and create a database in the same format as the web databases above:

Term      Results
baydin    C:\Alex\Documents\baydin_biz_plan.doc
          C:\Alex\Desktop\blog\post1.html
          C:\Alex\Documents\cashflow.xls
          etc.
chicken   C:\Alex\Documents\Recipes\chicken florentine.doc
          C:\Alex\Desktop\chicken.jpg
          etc.
outlook   C:\Program Files\Microsoft\Outlook.exe
          C:\Alex\Documents\problems with outlook.doc
          etc…

When I search on my computer for a word, like the web search engines, all my computer now has to do is look in that index and find the already-generated list of files that match my term.
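Here is a minimal Python sketch of a desktop indexer along those lines. It is not how Google Desktop Search or Windows Desktop Search are actually built. It assumes only `.txt` and `.html` files can be read, where a real indexer needs an extractor for every document type. The function names are made up.

```python
import os
import re
from collections import defaultdict

def extract_text(path):
    """Return the text of a file, or "" if we do not know how to read it."""
    # Assumption: only plain-text formats are handled in this sketch.
    if os.path.splitext(path)[1].lower() in {".txt", ".html"}:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            return handle.read()
    return ""

def build_desktop_index(root):
    """Walk every file under root once and map each term to the files containing it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Index the filename too, so unreadable files can still be found by name.
            text = name + " " + extract_text(path)
            for term in re.findall(r"[a-z0-9]+", text.lower()):
                index[term].add(path)
    return index

def search(index, term):
    """Answer a query from the index alone, without touching the disk."""
    return sorted(index.get(term.lower(), set()))
```

A background indexer would also have to keep the index current as files change. This sketch simply rebuilds it from scratch each time.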

The key takeaway is that thanks to these indexes, searching through the full text of every file on a computer is now thousands of times faster than just searching the filenames used to be. 

Alex Moore Technical

April 11th, 2009

Whoa, no way… someone’s actually reading this?  Sweet!

Over the next few months, we’ll be using this blog to describe the development (both the software and the business!) of Baydin, our first startup.  For the next month or so, we’re still at our day jobs, but starting May 8th, we’re going to be in it full time.

We’ll describe what we’re working on as soon as we get the right words put together.  Google is forever, after all :)   For the moment, let’s just say we’re going to make a plugin that will make Outlook a whole lot more helpful.

Alex Moore Small Talk

Hello world!

March 11th, 2008