INAICTA 2008 Nominee - BuddyMiner: A great tool to analyse email

Topic: Announcement

by Budi Susanto, Duta Wacana Christian University, Yogyakarta, Indonesia

Background

When an email collection grows into a large pile of documents, we often need to organize and analyze it to learn what it contains. An email client such as Thunderbird or Microsoft Outlook gives us tools for the basic functions: sending, retrieving, and organizing email, and detecting spam. Those tools are not designed for analyzing a large email collection. For that we need a dedicated analysis tool that reports information about the collection. By analyzing emails we can learn who communicates with us, which groups form (and how many) based on email frequency, what information is exchanged most often among participants, and which topics are discussed most.

Based on these requirements and problems, I have developed a unique tool with many features aimed at analyzing large document collections. I call this application BuddyMiner. The first version of BuddyMiner reads only the mbox file format (the format Mozilla uses to store email). BuddyMiner helps us find patterns of information, cluster emails automatically, chart statistics about the email collection, run information retrieval over the collection, and so on.
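To make the mbox step concrete, here is a minimal sketch of reading an mbox file and counting messages per sender. It is not BuddyMiner's code: it assumes Python and its standard mailbox module, and the function name count_senders is hypothetical.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Return a Counter mapping sender address -> number of messages.

    Hypothetical helper for illustration; not part of BuddyMiner.
    """
    counts = Counter()
    # create=False: fail instead of creating an empty file if the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr).
            _, address = parseaddr(str(message.get("From", "")))
            if address:
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts
```

Calling count_senders on a mailbox file and then most_common(10) on the result would list the ten most frequent senders, which is the kind of "who communicates with us" question described above.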

Features

BuddyMiner is built on text mining clustering, information retrieval, and information extraction theory. This approach makes BuddyMiner a specialized application that helps any organization find hidden patterns of information in its email collection. BuddyMiner is designed to analyze Indonesian and English documents. Picture 1 illustrates the main interface of BuddyMiner.

[Picture 1: buddyminer.jpg, the main interface of BuddyMiner]

BuddyMiner already provides many features. Here are some of them:

  • pattern matching for each email item, covering information such as URI addresses, dates/times, attachments, and phone numbers;

  • a tree structure to visualize email threads;

  • social information for any user clique, visualized as a graph;

  • information retrieval, so that users can easily find emails by search keyword;

  • clustering of email documents by their similarity, using the KMeans algorithm. The system groups all documents into clusters and provides the most important keywords for each group;

  • charts about the history of the email collection. For example, the system can show the number of emails in any time period, who participates most in our email, etc.
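The clustering feature in the list above can be illustrated with a small sketch. This is not BuddyMiner's implementation, only a plain k-means over length-normalized term-frequency vectors, written in Python. All function names are hypothetical, and two simplifications are assumptions made here: the first k documents serve as the initial centroids (so the result is reproducible), and there is no stop-word removal or stemming for Indonesian or English.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphabetic tokens only; a real system would also drop stop words.
    return re.findall(r"[a-z]+", text.lower())


def tf_vector(text, vocabulary):
    # Term-frequency vector scaled to unit length.
    counts = Counter(tokenize(text))
    vec = [float(counts[term]) for term in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iterations=20):
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of documents")
    # Assumption: deterministic start from the first k vectors.
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_keywords(centroid, vocabulary, n=3):
    # Highest-weighted terms of a centroid; ties are broken alphabetically.
    ranked = sorted(zip(centroid, vocabulary), key=lambda p: (-p[0], p[1]))
    return [term for weight, term in ranked[:n] if weight > 0]


def cluster_texts(texts, k, keywords_per_cluster=3):
    vocabulary = sorted({t for text in texts for t in tokenize(text)})
    vectors = [tf_vector(text, vocabulary) for text in texts]
    labels, centroids = kmeans(vectors, k)
    keywords = [top_keywords(c, vocabulary, keywords_per_cluster) for c in centroids]
    return labels, keywords
```

Given a list of email bodies, cluster_texts(bodies, k) returns one cluster label per email plus a keyword list per cluster. Production implementations usually start from random or k-means++ centroids and weight terms with TF-IDF instead of raw frequency.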

Finally, BuddyMiner can help in many fields. For example, customer service teams can analyze their customer email to learn important things about the history and content of the messages.
