Archive for October, 2008

Creating an Autotagger with Yahoo’s Term Extraction Service and YUI

Wednesday, October 22nd, 2008

So let’s talk about tags… If you edit a blog, photos, or even bookmarks these days, you know all about tagging.  In case you don’t know what tags are, you should read up on them, because you are missing a great thing on the internet (http://en.wikipedia.org/wiki/Tag_(metadata)).

It got me thinking (scary, I know) that perhaps it is a pain in the ass as a content developer to figure out what your tags should be.  I mean, come on, how do I know what people are searching for?  Isn’t there a better way?  Well, perhaps there is.  Perhaps we can tie into some kind of powerful open search API, like the one Yahoo has, to tell us what the popular terms in an article are and build some tags from them.

Hmm, that seems too easy, right?  Well, guess what: it is! Just look for yourself:

Test out the autotagger for yourself

So how the heck do you do that? Simple, just follow along:

The first thing you need to do is go to http://developer.yahoo.com and register an appid. This allows you 5000 searches a day on their open APIs, which is more than enough for personal use.  You then need to look over Yahoo’s Term Extraction service at http://developer.yahoo.com/search/content/V2/termExtraction.html to see what is required.
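To make the request concrete, here is a minimal sketch of how the service URL could be put together. The endpoint and the parameter names (appid, output, context) are the ones used by the proxy script later in this post; the helper name buildTermExtractionUrl and the 'yourappid' value are placeholders I made up for illustration, not part of any Yahoo library.

```javascript
// Hypothetical helper: builds the GET URL for the Term Extraction service.
// 'appid', 'output' and 'context' are the parameters the proxy in this post sends.
function buildTermExtractionUrl(appId, text) {
  const base = 'http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction';
  const params = [
    'appid=' + encodeURIComponent(appId),
    'output=json',
    // The text to analyze must be URL-encoded so that &, + and spaces survive
    'context=' + encodeURIComponent(text)
  ];
  return base + '?' + params.join('&');
}

console.log(buildTermExtractionUrl('yourappid', 'Tags & bookmarks'));
// http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction?appid=yourappid&output=json&context=Tags%20%26%20bookmarks
```

Keeping the appid on the server side, as the proxy does, means it never has to appear in the page source.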

Also, for fun, let’s put in a rich text editor, because we want to be user friendly and realistic about the environment.  You can see what is required for that at http://developer.yahoo.com/yui/editor/

So let’s write some code for our front end:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head>
    <title>RTE Autotagger</title>
    <!-- Skin CSS file -->
    <link rel="stylesheet" type="text/css" href="http://yui.yahooapis.com/2.6.0/build/assets/skins/sam/skin.css">
    </head>
    <body class="yui-skin-sam">
    <h2>RTE Autotagger</h2>
    <form id="rtepost" method="post">
    <br>
    <textarea id="editor" name="testrte"></textarea>
    <br>
    Tags:
    <input type="text" id="mytags" size="50">
    <input type="button" value="Guess My Tags" id="mytagbtn">
    <img src="http://l.yimg.com/jn/images/20081008192436/ajax-loader2.gif" height="20" width="20" id="loading" style="display:none;">
    <!--<input type="submit" />-->
    </form>
    <!-- Utility Dependencies -->
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/yahoo-dom-event/yahoo-dom-event.js"></script>
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/element/element-beta-min.js"></script>
    <!-- Needed for Menus, Buttons and Overlays used in the Toolbar -->
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/container/container_core-min.js"></script>
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/menu/menu-min.js"></script>
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/button/button-min.js"></script>
    <!-- Source file for Rich Text Editor -->
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/editor/editor-min.js"></script>
    <!-- Source file for Connection Manager -->
    <script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/connection/connection-min.js"></script>
    <script type="text/javascript">
    (function() {
        var Dom = YAHOO.util.Dom,
            Event = YAHOO.util.Event,
            Connect = YAHOO.util.Connect;

        var myEditor = new YAHOO.widget.Editor('editor', {
            height: '300px',
            width: '522px',
            dompath: true, // Turns on the bar at the bottom
            animate: true // Animates the opening, closing and moving of Editor windows
        });
        myEditor.render();

        // Show or hide the progress spinner. Dom.setStyle is used because
        // setAttribute('style', ...) does not work in older versions of IE.
        var showSpinner = function(visible) {
            Dom.setStyle('loading', 'display', visible ? 'inline' : 'none');
        };

        var generateTags = function(o) {
            // The proxy returns a JSON array of terms. eval() is tolerable here
            // only because the response comes from our own proxy.
            var terms = eval('(' + o.responseText + ')') || [];
            // Drop in the tags and replace all spaces; this is where you could do other
            // filtering, or you could drop in a dash between the words of a tag
            Dom.get('mytags').value = terms.join(',').replace(/ /g, '');
            // Turn off the progress spinner
            showSpinner(false);
        };

        // Add the click event to the tags button
        Event.on('mytagbtn', 'click', function() {
            // Turn on the progress spinner
            showSpinner(true);
            // Grab the content of the RTE window and strip out the HTML
            var text = myEditor._getDoc().body.innerHTML.replace(/(<([^>]+)>)/ig, '');
            // URL-encode the text so characters such as & and + survive the POST
            var postData = 'myContent=' + encodeURIComponent(text);
            // Make the Ajax call to our simple proxy
            Connect.asyncRequest('POST', 'make_tags_api.php', {
                success: generateTags,
                // Do not leave the spinner running if the request fails
                failure: function() { showSpinner(false); },
                scope: this
            }, postData);
        });
    })();
    </script>
    </body>
    </html>
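The comment in generateTags mentions that you could filter the terms or put a dash between the words of a tag instead of just deleting the spaces. A minimal sketch of that idea follows, assuming the proxy hands back a plain array of term strings; termsToTags is a hypothetical helper name, not something YUI provides.

```javascript
// Hypothetical helper: turns an array of extracted terms into a tag string.
// 'separator' replaces the whitespace inside each term ('' by default).
function termsToTags(terms, separator) {
  const sep = separator === undefined ? '' : separator;
  return (terms || [])
    .map((term) => String(term).trim().replace(/\s+/g, sep))
    .filter((tag) => tag.length > 0) // drop empty entries
    .join(',');
}

console.log(termsToTags(['henry paulson', 'financial crisis', 'recession']));
// henrypaulson,financialcrisis,recession
console.log(termsToTags(['henry paulson', 'financial crisis'], '-'));
// henry-paulson,financial-crisis
```

With a helper like this, the body of generateTags would only need to pass the parsed terms through it before writing the result to the mytags field.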

Now let’s write our simple proxy API:

    <?php
    // You might want to add some security here to make sure only you are hitting your API ;)
    // Send the response as JSON
    header("Content-Type: application/json");

    // Simple cURL helper that proxies a GET request to Yahoo search
    function getContextResource($url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $result = curl_exec($ch);
        curl_close($ch);
        return $result;
    }

    // Pull the posted content from POST (empty string if nothing was sent)
    $myContent = isset($_POST['myContent']) ? $_POST['myContent'] : '';
    // Build the URL to curl: add your appid (replace yourappid), the output format and
    // the content to check, which is utf8-encoded and then URL-encoded
    $contextUrl = 'http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction?appid=yourappid&output=json&context=' . urlencode(utf8_encode($myContent));
    // Make the cURL call with our URL
    $feed = getContextResource($contextUrl);
    // Convert to an array so we can cut out the things we don't need
    $convertToArray = json_decode($feed, true);
    // Only report back the values we need; fall back to an empty list if the call failed
    $cleanedArray = isset($convertToArray['ResultSet']['Result']) ? $convertToArray['ResultSet']['Result'] : array();
    // Return a JSON-encoded copy of the array we just cleaned up
    echo json_encode($cleanedArray);
    ?>
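If you would rather see the cleanup step on its own, here it is sketched in JavaScript. It assumes the service response has the shape the proxy relies on, a ResultSet object holding a Result array; extractTerms is a hypothetical name and the sample payload is made up for illustration.

```javascript
// Hypothetical helper: pulls the term list out of a raw service response.
// Returns an empty array when the text is not JSON or has an unexpected shape.
function extractTerms(jsonText) {
  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch (e) {
    return [];
  }
  const result = parsed && parsed.ResultSet && parsed.ResultSet.Result;
  return Array.isArray(result) ? result : [];
}

// Made-up sample payload in the shape the proxy expects
const sample = '{"ResultSet":{"Result":["financial crisis","recession"]}}';
console.log(extractTerms(sample));      // [ 'financial crisis', 'recession' ]
console.log(extractTerms('not json'));  // []
```

Returning an empty list on failure keeps the front end simple: it can always treat the response as an array.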

Wow, after all of that, what do we get?  Well, if I copy a well-written article from Yahoo Finance such as:
http://biz.yahoo.com/ap/081022/financial_meltdown.html

I get the following tags back:
treasurysecretaryhenrypaulson,apeconomics,henrypaulson,martincrutsinger,aggressivesteps,bushadministration,ysm,financialcrisis,recession,infinity,advertisement,economy,yahoo

Hmm, some of those are really not bad at all for just a simple search API, huh?
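A few of the tags above (ysm, infinity, advertisement, yahoo) look like page furniture rather than article content. One way to deal with that, sketched here under my own assumptions, is a small stop list applied before the tags go into the field. The STOP_TAGS list and the filterTags name are hypothetical; you would tune the list to your own content.

```javascript
// Hypothetical stop list of tags that come from page chrome, not the article
const STOP_TAGS = ['advertisement', 'infinity', 'yahoo', 'ysm'];

// Hypothetical helper: removes stop-listed, empty and duplicate tags
// from a comma-separated tag string.
function filterTags(tagString, stopTags) {
  const blocked = new Set(stopTags.map((t) => t.toLowerCase()));
  const seen = new Set();
  return tagString
    .split(',')
    .map((t) => t.trim().toLowerCase())
    .filter((t) => {
      if (t === '' || blocked.has(t) || seen.has(t)) {
        return false;
      }
      seen.add(t);
      return true;
    })
    .join(',');
}

console.log(filterTags(
  'henrypaulson,ysm,financialcrisis,recession,infinity,advertisement,economy,yahoo',
  STOP_TAGS
));
// henrypaulson,financialcrisis,recession,economy
```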

Shine Gallery Uploader - A Talk at the Yahoo F2E Summit 08

Thursday, October 9th, 2008

I got the opportunity to speak at the Yahoo F2E Summit this year to discuss Shine’s User Generated Gallery Tool. I presented with Gamaiel Zavala, a colleague of mine with whom I worked to create the tool. He spoke about the prototype and how to extend YUI, and I spoke about extending the prototype in a production environment.

You can check out the tool at:

http://shine.yahoo.com/write (click the gallery button)

I hope everyone enjoyed the presentation, and I believe we all had a great laugh at the expense of my colleagues’ uploaded photographs.

You can see my pictures of the Summit at:

http://flickr.com/photos/14261072@N08/tags/f2esummit08/

You can see everyone’s at:

http://flickr.com/photos/tags/f2esummit08/

The lectures covered a range of topics, but I found the best to be Zakas’ talk on JS error handling and the talk on JavaScript bubbling. Everyone did a great job, and it was a wonderful Front End community event within Yahoo!