Alas, it’s 2016 and there’s still no serious competitor to Google.
And there will not be any time soon. Writing an efficient crawler for what we call the "modern" web is not something a small or even medium-sized company can pull off. Google enjoys a tremendous competitive advantage: people specifically optimize webpages for what it can and cannot do. So any newcomer to the field will have to replicate tons of technologies Google had years to perfect (in addition to solving problems like storage, search logic and bandwidth management).
A newcomer can index, store, search and display data differently, but first they still have to get that data. Which means dealing with JavaScript, paywalls and other fun stuff like that.
Besides, regardless of what you do, you would need to have tons of storage, bandwidth, CPU power and a high-availability infrastructure.
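To make the JavaScript point concrete, here's a toy sketch (the page content is invented for illustration) of why a static-HTML crawler indexes nothing on a client-side-rendered page: the visible text only exists after a script runs, and a crawler that doesn't execute JavaScript never sees it.

```python
from html.parser import HTMLParser

# A "modern" page as a naive crawler fetches it: the article text is
# injected client-side by JavaScript, so the static HTML carries none of it.
PAGE = """
<html><body>
  <div id="app"></div>
  <script>
    document.getElementById('app').innerText = 'Actual article text';
  </script>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Collects visible text outside <script> tags, like a non-JS crawler would."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())

parser = TextExtractor()
parser.feed(PAGE)
print(parser.chunks)  # [] -- nothing to index without running the script
```

Executing that script at crawl scale means running a headless browser per page, which is where the CPU and infrastructure costs mentioned above come from.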
The vast majority of the data on the internet is dross or spam; if someone found a way to index only quality content, the crawling burden could be radically smaller.
I remember DDG did side-by-side comparisons, but did they ever just query Google themselves and then do their own stuff on top? Anyway, I'm not sure Google can be beaten at all by any particular newcomer; instead, they might suffer death by a thousand cuts as more niche engines like Shodan show up and bring in context-specific searches that are orders of magnitude better than Google's more complicated all-web-they-approve-of-and-find-relevant-to-your-account search.
You are not correct. The Apache Foundation projects, Electron/Chromium, and Elastic are the open-source counterparts of Google's components. memex-explorer is the paradigm, but it was funded by DARPA/NASA JPL, so work on it was suspended.
A hierarchical crawl index and a DNS rebuild (or a similar model) would lead to a search platform and fix discovery, monopoly, and monetization.
Note that Google has solved a HUGE problem, but their task is now impossible. You can't simply use a textbook and some booleans and quotation marks (which I'm not sure they even respect) to deliver results for a billion people.
Individuals and the market will calibrate their own results.
If a small company could design a credible alternative to Google, I'm sure it would have been done - the incentives are too great.
Entrenched businesses are RARELY displaced from their top position. Instead, what usually happens is the world changes around them, and the entrenched business is ill suited to compete in the new world. Nobody ever managed to seriously challenge Microsoft for desktop/laptop OS dominance: https://en.wikipedia.org/wiki/Usage_share_of_operating_syste.... The issue is that desktop/laptop OS dominance is no longer as important as it once was.
I started working on a model. I discovered memex-explorer afterward; they had apparently designed a slightly less decentralized model and had already implemented it, but the project was suspended indefinitely.
memex-explorer + blockchain is the model. A single company cannot do it. A company can design the platform, and many companies can sell on the optimization and information marketplace to fix the problem.
Depends on how they arrive at it. If they create true AI with lots of spare capacity then maybe Google is threatened, but what seems more likely to me is that true AI is achieved in some hodgepodge system that wasn't well architected to scale. An example of this would be a true AI that came about as a result of embodiment in a robot. Yes, it is true AI but that doesn't mean it knows how to read, let alone read and understand billions of webpages. Even if it could read a page of general text at the level of an average adult, that doesn't mean it can read it quickly.
[EDIT]
And as an addendum to this fun tangent, even if the AI was capable, in principle, of reading billions of webpages fast and well, we don't know what the power requirements for this would be. This hypothetical small company may have stumbled upon the right algorithms and the right training data to produce a true AGI, but they may simply not have the hardware or the engineering know-how to scale it up across multiple processors. Or scaling it up may require too much (i.e., more than the company can afford) data bandwidth if the robot is controlled remotely. Again, it depends on the details of how they arrived at the AGI.
And, more likely, if a small company does come up with a great AI, they're going to get bought by Google, not compete. The reason being you are still going to need the sheer hardware capacity (a LOT of it) to process so much data. I remember a good article about how YouTube would have likely hit serious trouble had Google not bought them. YouTube was starting to crumble under the exponential growth in load, and Google was perfectly matched to provide infrastructure support.
Maybe. Search is a resource-intensive algorithmic problem. So you need one of two things to beat Google: more resources or a much better algorithm.
You're not going to get the first unless you're Facebook or Amazon or God, but maybe you can build a smarter algorithm. You are up against an army of some of the smartest computer scientists and mathematicians ever assembled -- but what you have going for you is a complete lack of inertia or legacy. You could try crazy things that Google might not, because they won't think it'll work. If you get lucky, one of those blows up. But you have to get very lucky (this is the Innovator's Dilemma in a nutshell).
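For anyone who hasn't seen the core data structure, the "resource-intensive algorithmic problem" boils down to variations on the inverted index. A minimal toy sketch (documents and queries invented for illustration; real engines add ranking, sharding, and compression on top):

```python
from collections import defaultdict

# Tiny corpus: doc id -> text. A real engine holds billions of these.
docs = {
    1: "google search engine crawler",
    2: "haskell functions type search",
    3: "search algorithm inverted index",
}

# Inverted index: term -> set of doc ids containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    """AND-query: return the ids of docs containing every query term."""
    terms = query.split()
    if not terms:
        return set()
    result = index[terms[0]].copy()
    for term in terms[1:]:
        result &= index[term]  # intersect posting lists
    return result

print(search("search engine"))  # {1}
print(search("search"))         # {1, 2, 3}
```

The structure itself is textbook; the hard (and expensive) parts are building it over trillions of pages and ranking the intersection well, which is where a "much better algorithm" would have to win.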
You can also carve out a niche. Hoogle would be an example of that: if you know the searcher cares only about Haskell functions, I imagine you can beat Google in that space. That approach probably extends to other interest spheres.
Additionally, you'll still have to find a way to make money. Even if you manage to get on par with Google's search results, it is difficult to replicate their advertising cash cow, and even more difficult to invent a completely new monetization strategy and make it successful.