Collective Intelligence in Action

by Satnam Alag

List Price: $44.99
Our Price: $20.00
You Save: $24.99 (56%)
Availability: Usually ships in 1-2 business days
Buy Used: from $10.15
Category: Book

Book Summary Information

Author: Satnam Alag
Edition: Paperback
Audio: English (Unknown); English (Original Language); English (Published)
Published: 2008-11-04
ISBN: 1933988312
Number of pages: 425
Publisher: Manning Publications
Product features:
  • ISBN13: 9781933988313
  • Condition: Used - Very Good
  • Notes: 100% Satisfaction Guarantee. Tracking provided on most orders. Buy with Confidence! Millions of books sold!

Book Reviews of Collective Intelligence in Action

Book Review: Fascinating book about how Web 2.0 sites work.
Rating: 5 Stars

To really understand this book one would probably have to be a Java programmer, which I'm not, but I was able to follow the argument. I do have some background in data mining using SAS and SQL, and the mathematics described is fairly easy to understand for someone with even a first-year engineering or applied math background. I also have an interest in linguistics, which kept me going.

The basic idea is that one can catalog documents by removing irrelevant words (adjectives, abstract pronouns, conjunctions), "stemming" the remaining words (i.e., reducing "sews", "sewing", "resew", "sewer" to a root "sew"), and creating a vector that holds each root word and its frequency, which is then normalized. One simple result is the ability to produce "word clouds". Similarity between documents is measured by taking the dot product of the two normalized vectors. Any document compared to itself would have a dot product of 1. Two documents with no common stem words would have a dot product of zero. Similar docs would have a high value close to 1, say 0.8. Dissimilar docs would have a low coefficient, say 0.15. Even confusing "sewer" (a conduit for waste) with "sewer" (one who uses a needle and thread) is accounted for, because the two docs would be similar on only a couple of keywords and dissimilar on most others.
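
As a rough illustration of the vector comparison described in this review, here is a minimal Python sketch (the book itself works in Java). The stop-word list, the suffix-stripping rule, and all function names are assumptions made up for this example; a real system would use a proper stemmer, such as the ones that ship with Lucene.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "with"}
SUFFIXES = ("ing", "s")


def stem(word):
    """Crude suffix stripping; not a real stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_vector(text):
    """Map each stem to its frequency, scaled so the vector has length 1."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(stem(w) for w in words if w not in STOP_WORDS)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def similarity(vec_a, vec_b):
    """Dot product of two unit-length vectors (cosine similarity)."""
    return sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())


doc_a = term_vector("She sews and is sewing the quilt")
doc_b = term_vector("Stock prices fell")
print(similarity(doc_a, doc_a))  # a document compared to itself
print(similarity(doc_a, doc_b))  # no stems in common
```

Because each vector is scaled to length 1 before comparison, a long document and a short one on the same subject can still score as similar; only the relative word frequencies matter.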

What's really neat is how this information gets collected and can be applied. Social networking sites, including the one you are reading right now, Amazon.com, collect data on us through our choices. Browse for a book while logged on, and that's something you are interested in. Approve a review, and the words in the review, the summary of the book, and the title count towards your interests. Disapprove, and that counts against your interests. Write a review, and the words you write become part of your cumulative profile as well, reduced to a vector or vectors of keywords and frequencies.

Here's how it gets applied: one of Amazon's marketing tools is its "recommendation engine". (The book talks about Netflix's recommendation engine and business model.) By matching your vector against those of other people who have bought or viewed what you have bought, a prediction can be made as to the likelihood of your being interested in something that they have bought, or not interested in items that they rejected or disliked. The more Amazon caters to what you are interested in, and doesn't bother you with irrelevancies, the happier you may be.
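
The matching step can be sketched in a few lines of Python. This is only one simple way to do it (a similarity-weighted average over other users), not a description of what Amazon or Netflix actually run; the user profiles, item names, and the +1/-1 rating scheme are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating profiles (dicts)."""
    dot = sum(r * v[item] for item, r in u.items() if item in v)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_interest(target, others, item):
    """Similarity-weighted average of other users' ratings for `item`.

    A positive result suggests interest, a negative one suggests
    disinterest, and 0.0 means there is no evidence either way.
    """
    weighted, total = 0.0, 0.0
    for profile in others:
        if item not in profile:
            continue
        sim = cosine(target, profile)
        weighted += sim * profile[item]
        total += abs(sim)
    return weighted / total if total else 0.0


# Hypothetical profiles: +1 = bought or liked, -1 = rejected or disliked.
alice = {"book_a": 1, "book_b": 1, "book_c": -1}
bob = {"book_a": 1, "book_b": 1, "book_d": 1}      # tastes like Alice's
carol = {"book_a": -1, "book_c": 1, "book_d": -1}  # tastes opposite to Alice's

print(predict_interest(alice, [bob, carol], "book_d"))
```

Note that a user with opposite tastes still contributes useful evidence: Carol disliking book_d counts in its favor for Alice, because their similarity is negative.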

Other applications discussed include the automatic creation of folksonomies (taxonomies based on popular usage) using cluster analysis, and categorization using Bayes' theorem.

In addition to recommendation engines, Alag points out the usefulness of these techniques to search and names several search engines that apply this approach (as does Google). Other uses include tools that seek out and provide news based on your preferences or suggest "friends" (e.g., Facebook or eHarmony might use these ideas); searches for similar material to identify copyright infringement; email filters that keep out spam for Rolex watches or Viagra (unless you are interested in Rolex watches or Viagra); virus detection engines based on code phrases; and early detection of epidemics or adverse reactions to medication through similarities in medical reports. Alag himself appears to be working at a biotech firm, NextBio, that matches public medical and genome-related data to data held by private companies.

Some of the basic tools discussed are Lucene, a free version of what Google will sell you for a search engine, and Nutch, a free web crawler, both of which require coding; and WEKA, a free open source data mining package that looks usable by the rest of us.

Loved the book and the author's organization of the material. Some of the social implications are scary, especially for privacy concerns, but so is the implication of not leveraging the information held within one's organization to provide the best possible service. For example, the World Bank has the capability (not necessarily using these methods) to match similar projects around the world so that experience gained in one area can be found and applied elsewhere. This is a key, fast-moving technology that one needs to understand in order to see where we are going as a society. C.I. in Action is merely the opening salvo: the methods and techniques described are the basics, but there is much room for refinement and elaboration, and this topic could be the start of a whole new field. The book also recommends, and has sparked my interest in, the site [...], which is probably more accessible to someone without a math or tech background.

Finally, a note to SF fans, esp. fans of Spider Robinson's Callahan's Crosstime Saloon series: this may be the point at which the Web starts to appear to be intelligent. :-)

Summary of Collective Intelligence in Action

There's a great deal of wisdom in a crowd, but how do you listen to a thousand people talking at once? Identifying the wants, needs, and knowledge of internet users can be like listening to a mob.

In the Web 2.0 era, leveraging the collective power of user contributions, interactions, and feedback is the key to market dominance. A new category of powerful programming techniques lets you discover the patterns, inter-relationships, and individual profiles (the collective intelligence) locked in the data people leave behind as they surf websites, post blogs, and interact with other users.

Collective Intelligence in Action is a hands-on guidebook for implementing collective intelligence concepts using Java. It is the first Java-based book to emphasize the underlying algorithms and technical implementation of vital data gathering and mining techniques like analyzing trends, discovering relationships, and making predictions. It provides a pragmatic approach to personalization by combining content-based analysis with collaborative approaches.

This book is for Java developers implementing Collective Intelligence in real, high-use applications. Following a running example in which you harvest and use information from blogs, you learn to develop software that you can embed in your own applications. The code examples are immediately reusable and give the Java developer a working collective intelligence toolkit.

Along the way, you work with a number of APIs and open-source toolkits, including text analysis and search using Lucene, web-crawling using Nutch, and applying machine learning algorithms using WEKA and the Java Data Mining (JDM) standard.

Internet Books

Most talked about in Internet Books
PHP and MySQL by Example
by Ellie Quigley, Marko Gargenta
Prentice Hall; Published: 2006-12-02; Paperback; Book
Best price: $25.99
Price in other shops: $54.99
Navigate the Net: A Comprehensive Learning Experience for Travel Professionals
by Shelly M. Houser
Prentice Hall; Published: 2002-05-24; Paperback; Book
Best price: $12.50
Price in other shops: $61.60
Elijah Lovejoy's ASP Training Course (Complete Video Course)
by Elijah Lovejoy
Prentice Hall PTR; Published: 2001-12-18; Hardcover; Book
Best price: $66.49
Price in other shops: $69.99
Weaving a Website: Programming in HTML, Java Script, Perl and Java
by Susan Anderson-Freed
Prentice Hall; Published: 2001-08-16; Paperback; Book
Best price: $19.93
Price in other shops: $110.00
Online Resource Guide for Law Enforcement
by Timothy M. Dees
Prentice Hall; Published: 2001-06-30; Paperback; Book
Best price: $1.99
Price in other shops: $44.80
Core Servlets and Javaserver Pages: Core Technologies, Vol. 1 (2nd Edition)
by Marty Hall, Larry Brown
Prentice Hall; Published: 2003-09-05; Paperback; Book
Best price: $27.75
Price in other shops: $64.99
Publish it on the Web! Windows, Second Edition
by Bryan Pfaffenberger
Academic Press; Published: 1997-08-13; Paperback; Book
Best price: $9.00
Price in other shops: $37.95
Big Book of FYI RFCs (Big Books)
Morgan Kaufmann; Published: 2000-08-15; Paperback; Book
Best price: $8.50
Price in other shops: $34.95
HITTESDORF CORBA/IIOP CLEARLY EXPLAINED (Clearly Explained)
by Michael Hittesdorf
AP Professional; Published: 2000-03-01; Paperback; Book
The Internet Outdoor Family Fun Yellow Pages: The Online Guide to the Best Outdoor Family Sites
by Jack Sanders
International Marine Publishing; Published: 1999-05-25; Paperback; Book
Best price: $15.56
Price in other shops: $19.95
Similar Books and other products
Data Analysis with Open Source Tools
by Philipp K. Janert
O'Reilly Media; Published: 2010-11-25; Paperback; Book
Best price: $21.99
Price in other shops: $39.99
Natural Language Processing with Python
by Steven Bird, Ewan Klein, Edward Loper
O'Reilly Media; Published: 2009-07-07; Paperback; Book
Best price: $31.45
Price in other shops: $44.99
Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems)
by Ian H. Witten, Eibe Frank, Mark A. Hall
Morgan Kaufmann; Published: 2011-01-20; Paperback; Book
Best price: $32.50
Price in other shops: $69.95
Hadoop: The Definitive Guide
by Tom White
Yahoo Press; Published: 2010-10-12; Paperback; Book
Best price: $26.41
Price in other shops: $49.99
Lucene in Action, Second Edition: Covers Apache Lucene 3.0
by Michael McCandless, Erik Hatcher, Otis Gospodnetic
Manning Publications; Published: 2010-07-28; Paperback; Book
Best price: $28.20
Price in other shops: $49.99
Hadoop in Action
by Chuck Lam
Manning Publications; Published: 2010-12-22; Paperback; Book
Best price: $24.50
Price in other shops: $44.99
Mahout in Action
by Sean Owen, Robin Anil, Ted Dunning, Ellen Friedman
Manning Publications; Published: 2011-10-14; Paperback; Book
Best price: $25.57
Price in other shops: $44.99
Algorithms of the Intelligent Web
by Haralambos Marmanis, Dmitry Babenko
Manning Publications; Published: 2009-07-05; Paperback; Book
Best price: $18.99
Price in other shops: $44.99
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
by Matthew A. Russell
O'Reilly Media; Published: 2011-02-08; Paperback; Book
Best price: $22.25
Price in other shops: $39.99
Programming Collective Intelligence: Building Smart Web 2.0 Applications
by Toby Segaran
O'Reilly Media; Published: 2007-08-23; Paperback; Book
Best price: $18.00
Price in other shops: $39.99