Alas, it’s 2016 and there’s still no serious competitor to Google.
And there will not be any time soon. Writing an efficient crawler for what we call the "modern" web is not something a small or even medium-sized company can pull off. Google enjoys a tremendous competitive advantage: people specifically optimize webpages for what it can and cannot do. So any newcomer to the field will have to replicate tons of technologies Google had years to perfect (in addition to solving problems like storage, search logic and bandwidth management).
A newcomer can index, store, search and display data differently, but first they still have to get that data. Which means dealing with JavaScript, paywalls and other fun stuff like that.
Besides, regardless of what you do, you would need to have tons of storage, bandwidth, CPU power and a high-availability infrastructure.
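To make the JavaScript point concrete, here's a toy sketch (the page content is invented for illustration) of why a static-HTML crawler indexes nothing on a client-side-rendered page: the visible text only exists after a script runs, and a crawler that doesn't execute JavaScript never sees it.

```python
from html.parser import HTMLParser

# A "modern" page as a naive crawler fetches it: the article text is
# injected client-side by JavaScript, so the static HTML carries none of it.
PAGE = """
<html><body>
  <div id="app"></div>
  <script>
    document.getElementById('app').innerText = 'Actual article text';
  </script>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Collects visible text outside <script> tags, like a non-JS crawler would."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())

parser = TextExtractor()
parser.feed(PAGE)
print(parser.chunks)  # [] -- nothing to index without running the script
```

Executing that script at crawl scale means running a headless browser per page, which is where the CPU and infrastructure costs mentioned above come from.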
The vast majority of the data on the internet is dross or spam; if someone found a way to index only quality content, the crawling burden could be radically smaller.
I remember DDG did side-by-side comparisons, but did they ever just query Google themselves and then do their own stuff on top? Anyway, I'm not sure Google can be beaten at all by any particular newcomer; instead, they might suffer death by a thousand cuts as more niche engines like Shodan show up and bring in context-specific searches that are orders of magnitude better than Google's more complicated all-web-they-approve-of-and-find-relevant-to-your-account search.
You are not correct. The Apache Foundation projects, Electron/Chromium, and Elastic are the open-source counterparts of Google's components. memex-explorer is the paradigm, but it was funded by DARPA/NASA JPL, so work on it was suspended.
A hierarchical crawl index and a DNS rebuild (or a similar model) would lead to a search platform and fix discovery, monopoly, and monetization.
Note that Google has solved a HUGE problem, but their task is now impossible. You can't simply use a textbook and some booleans and quotation marks (which I'm not sure they even respect) to deliver results for a billion people.
Individuals and the market will calibrate their own results.
If a small company could design a credible alternative to Google, I'm sure it would have been done - the incentives are too great.
Entrenched businesses are RARELY displaced from their top position. Instead, what usually happens is the world changes around them, and the entrenched business is ill suited to compete in the new world. Nobody ever managed to seriously challenge Microsoft for desktop/laptop OS dominance: https://en.wikipedia.org/wiki/Usage_share_of_operating_syste.... The issue is that desktop/laptop OS dominance is no longer as important as it once was.
I started working on a model. I discovered memex-explorer afterward; they had apparently designed a slightly less decentralized model and had already implemented it, but the project was suspended indefinitely.
memex-explorer + blockchain is the model. A single company cannot do it. A company can design the platform, and many companies can sell on the optimization and information marketplace to fix the problem.
Depends on how they arrive at it. If they create true AI with lots of spare capacity then maybe Google is threatened, but what seems more likely to me is that true AI is achieved in some hodgepodge system that wasn't well architected to scale. An example of this would be a true AI that came about as a result of embodiment in a robot. Yes, it is true AI but that doesn't mean it knows how to read, let alone read and understand billions of webpages. Even if it could read a page of general text at the level of an average adult, that doesn't mean it can read it quickly.
[EDIT]
And as an addendum to this fun tangent, even if the AI was capable, in principle, of reading billions of webpages fast and well, we don't know what the power requirements for this would be. This hypothetical small company may have stumbled upon the right algorithms and the right training data to produce a true AGI, but they may simply not have the hardware or the engineering know-how to scale it up across multiple processors. Or scaling it up may require too much (i.e., more than the company can afford) data bandwidth if the robot is controlled remotely. Again, it depends on the details of how they arrived at the AGI.
And, more likely, if a small company does come up with a great AI, they're going to get bought by Google, not compete. The reason being you are still going to need the sheer hardware capacity (a LOT of it) to process so much data. I remember a good article about how YouTube would have likely hit serious trouble had Google not bought them. YouTube was starting to crumble under the exponential growth in load, and Google was perfectly matched to provide infrastructure support.
Maybe. Search is a resource-intensive algorithmic problem. So you need one of two things to beat Google: more resources or a much better algorithm.
You're not going to get the first unless you're Facebook or Amazon or God, but maybe you can build a smarter algorithm. You are up against an army of some of the smartest computer scientists and mathematicians ever assembled -- but what you have going for you is a complete lack of inertia or legacy. You could try crazy things that Google might not, because they won't think it'll work. If you get lucky, one of those blows up. But you have to get very lucky (this is the Innovator's Dilemma in a nutshell).
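For anyone who hasn't seen the core data structure, the "resource-intensive algorithmic problem" boils down to variations on the inverted index. A minimal toy sketch (documents and queries invented for illustration; real engines add ranking, sharding, and compression on top):

```python
from collections import defaultdict

# Tiny corpus: doc id -> text. A real engine holds billions of these.
docs = {
    1: "google search engine crawler",
    2: "haskell functions type search",
    3: "search algorithm inverted index",
}

# Inverted index: term -> set of doc ids containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    """AND-query: return the ids of docs containing every query term."""
    terms = query.split()
    if not terms:
        return set()
    result = index[terms[0]].copy()
    for term in terms[1:]:
        result &= index[term]  # intersect posting lists
    return result

print(search("search engine"))  # {1}
print(search("search"))         # {1, 2, 3}
```

The structure itself is textbook; the hard (and expensive) parts are building it over trillions of pages and ranking the intersection well, which is where a "much better algorithm" would have to win.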
You can also carve out a niche. Hoogle would be an example of that: if you know the searcher cares only about Haskell functions, I imagine you can beat Google in that space. That approach probably extends to other interest spheres.
Additionally, you'll still have to find a way to make money. Even if you manage to get on par with Google's search results, it is difficult to replicate their advertising cash cow, and even more difficult to invent a completely new monetization strategy and make it successful.