    Topic: Hakia Takes On Google With Semantic Technologies
    Category: Information Retrieval | Semantic Web
    admin (W3China webmaster, original poster)


    http://www.readwriteweb.com/archives/hakia_takes_on_google_semantic_search.php

    Hakia Takes On Google With Semantic Technologies

    Written by Richard MacManus / March 23, 2007 /
    This week I spoke to Hakia founder and CEO Dr. Riza C. Berkan and COO Melek Pulatkonak. Hakia is one of the more promising [URL=http://www.readwriteweb.com/archives/top_100_alternative_search_engines_feb07.php]Alt Search Engines[/URL] around, with a focus on natural language processing methods to try and deliver 'meaningful' search results. Alex Iskold [URL=http://www.readwriteweb.com/archives/hakia_meaning-based_search.php]profiled Hakia[/URL] for R/WW at the beginning of December and he concluded, after a number of search experiments, that Hakia was intriguing - but it was not at a level to compete with Google yet. It is important to note that Hakia is a relatively early beta product and is still in development. But given the speed of Internet time, 3.5 months is probably a good time to check back and see how Hakia is progressing...

    What is Hakia?
    Riza and Melek first told me what makes Hakia different from Google. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. Riza and Melek told me that the future of search engines will go beyond keyword analysis - search engines will talk back to you and in effect become your search assistant.

    One point worth noting here is that, currently, Hakia still has some human post-editing going on - so it isn't 100% computer powered at this point.

    Hakia has two main [URL=http://www.hakia.com/technology.html]technologies[/URL]:

    1) QDEX Infrastructure (which stands for Query Detection and Extraction) - this does the heavy lifting of analyzing search queries at a sentence level.

    2) SemanticRank Algorithm - this is essentially the science they use, made up of ontological semantics that relate concepts to each other.

    If you're interested in the tech aspects, also check out [URL=http://labs.hakia.com/hakia-lab.html]hakia-Lab[/URL] - which features their latest technology R&D.


    How is Hakia different from Ask.com?
    Hakia most reminds me of Ask.com, which takes more of a natural language approach than the other big search engines ('ask' a question, get an answer) and, like Hakia, also uses human editing. [I [URL=http://www.readwriteweb.com/archives/ask_what_differentiates_them_from_google.php]interviewed Ask.com[/URL] back in November.] So I asked Riza and Melek what the difference is between Hakia and Ask.com.

    Riza told me that Ask.com is an indexing search engine and has no semantic analysis. Going one level deeper, he said to look at the basis of its results: Ask.com bolds keywords (i.e. it works at the keyword level), whereas Hakia, Riza said, understands the sentence. He also said that Ask.com's categories are not meaning-based - they are "canned or prefixed". Hakia, he said, understands the semantic relationships.

    Hakia vs Google
    I next referred Riza and Melek to [URL=http://www.readwriteweb.com/archives/interview_with_matt_cutts_next_generation_search.php]Read/WriteWeb's interview with Matt Cutts of Google[/URL], in which Matt told me that Google is essentially already using semantic technologies, because the sheer amount of data that Google has "really does help us understand the meanings of words and synonyms". Riza's view on that is that Google works with popularity algorithms and so it can "never have enough statistical material to handle the Long Tail". He says a search engine has to understand the language, in order to properly serve the Long Tail.

    Moreover, Hakia's view is that the vastness of data that Google has doesn't solve the semantic problem - Riza and Melek think there needs to be that semantic connection present.

    Their bigger claim though is that the big search companies are still thinking within an indexing framework (personalization etc). Hakia thinks that indexing has plateaued and that semantic technologies will take over for the next generation of search. They say that semantic technologies allow you to analyze content, which they think is 'outside the box' of what the big search companies are doing. Riza admitted that it was possible Google was investigating semantic technologies, behind closed doors. Nevertheless, he was adamant that the future is understanding info, not merely finding it - which he said is a very difficult problem to solve, but it's Hakia's mission.

    Semantic web and Tim Berners-Lee

    Throughout the interview, I noticed the word "semantic" was being used a lot - but their interpretation seemed to be different to that of Tim Berners-Lee, whose notion of a Semantic Web is generally what Web people think about when uttering the 'S' word. Riza confirmed that their concept of semantic technology is indeed different. He said that Tim Berners-Lee is banking on certain standards being accepted by web authors and writers - which Riza said is "such a big assumption to start this technology". He said that it forces people to be linguists, which is not a common skill.

    Furthermore, Riza told me that Berners-Lee's Semantic Web is about "imposing a structure that assumes people will obey [and] follow". He said that the "entire Semantic Web concept relies on utilizing semantic tagging, or labeling, which requires people to know it." Hakia, he said, doesn't depend on such structures. Hakia is all about analyzing the normal language of people - so a web author "doesn't need to mess with that".

    Competitors
    Apart from Google and the other big 'indexing' search engines, Hakia is competing against other semantic search engines like [URL=http://www.powerset.com/]Powerset[/URL] and hybrids like [URL=http://www.wikia.com/]Wikia[/URL]. Perhaps also [URL=http://radar.oreilly.com/archives/2007/03/freebase_will_p_1.html]Freebase[/URL] - although Riza thinks the latter may be "old semantic web" (but he says there's not enough information about it to say for sure).

    Conclusion
    Hakia plans to launch its version 1.0 (i.e. get out of beta) by the end of 2007. As of now my assessment is the same as Alex's was in December - it's a very promising, but as yet largely unproven, technology.

    I also suspect that Google is much more advanced in search technology than Mountain View is letting on. We know that Google's scale is a huge advantage, but their experiments with things like personalization and structured data (Google Base) show me that Google is also well aware of the need to implement next-generation search technologies. Also, as Riza noted during the interview, who knows what Google is doing behind closed doors.

    Will semantic technologies and 'sentence analysis' be the next wave of search? It seems very plausible. So with a bit more development, Hakia could well become compelling to a mass market. Therefore how and when Google responds to Hakia will be something to watch carefully.

    [This post was edited by the author on 2007-3-28 19:07:59]

    Posted 2007/3/27 9:53:00
     
    qxr777 (reply #2)
    Looks like Hakia really is powerful; definitely worth looking forward to.
    Posted 2007/3/28 13:45:00
     