6th Annual International Conference on Industrial Engineering and Operations Management

In-house Crowdsourcing-based Entity Resolution using argumentation: Issue of Common Names

Morteza Saberi
Publisher: IEOM Society International
Track: Graduate Student Paper Competition
Abstract

Customers increasingly use information and communications technology (ICT) to interact with companies, which has raised their expectations of high-quality service delivery. Customers' access to multiple communication channels also produces a large and diverse volume of unstructured and semi-structured data for organisations to store, which leads to duplicate customer profiles in the database. Companies therefore need appropriate data cleansing strategies in order to address customer concerns efficiently. Entity Resolution (ER) is a data cleansing technique used to disambiguate the various manifestations of the same real-world object in a database and thereby improve search results. Recently, crowdsourcing has become a popular way to improve ER by bringing human intelligence into the process. However, differences of opinion may arise among the crowd during ER. An argumentation formalism can be used to identify and resolve these conflicts with justifications. This study considers the issue of common names in the domain of Customer Relationship Management (CRM) and proposes a crowdsourcing-based argumentation methodology for improving ER. An argumentation-driven crowd labelling scheme for pairs of records helps the organisation generate training data for the ER algorithm. A probabilistic framework identifies the pair of records expected to have the maximum impact on ER and forwards it to the crowd for labelling (i.e. whether or not both records represent the same entity). Here, customer service representatives (CSRs) act as the crowd. The CSRs give their opinions (labels) in the form of arguments either for or against matching the pair. An argumentation conflict resolution algorithm then produces the final judgement on the label, i.e. matched or not matched. The contributions of this study are threefold. First, a probabilistic method identifies, among the candidate duplicate profiles, the pair whose labelling is expected to benefit ER the most. Second, CSR opinions are used for ER in the form of arguments that label the pair as matched or not matched. Third, an argumentation conflict resolution algorithm determines the final label and forwards the result to the ER algorithm.
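The three steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the probabilistic framework or the conflict resolution algorithm, so everything below is an assumption made for illustration: pair selection is approximated by choosing the pair whose estimated match probability is closest to 0.5, and conflicts are resolved with the grounded semantics of a Dung-style abstract argumentation framework. The record identifiers, probabilities, argument names, and function names are all hypothetical.

```python
def most_informative_pair(match_probs):
    """Assumed selection rule: the pair whose match probability is closest to 0.5."""
    return min(match_probs, key=lambda pair: abs(match_probs[pair] - 0.5))


def grounded_extension(arguments, attacks):
    """Accepted arguments under grounded semantics (an assumed choice of semantics).

    An argument is accepted once all of its attackers are defeated, and
    defeated once at least one of its attackers is accepted.
    """
    accepted, defeated = set(), set()
    changed = True
    while changed:
        changed = False
        for arg in arguments:
            if arg in accepted or arg in defeated:
                continue
            attackers = {src for (src, dst) in attacks if dst == arg}
            if attackers <= defeated:
                accepted.add(arg)
                changed = True
            elif attackers & accepted:
                defeated.add(arg)
                changed = True
    return accepted


def final_label(claims, accepted):
    """Label supported by the accepted arguments, or 'undecided' if none or mixed."""
    labels = {claims[a] for a in accepted if claims.get(a) is not None}
    return labels.pop() if len(labels) == 1 else "undecided"


# Hypothetical match probabilities for three candidate pairs of customer records.
match_probs = {("r1", "r2"): 0.93, ("r3", "r4"): 0.52, ("r5", "r6"): 0.08}
pair = most_informative_pair(match_probs)

# Hypothetical CSR arguments about the selected pair.
# a1: same name and same phone number, so the records match.
# a2: the dates of birth differ, so the records do not match.
# a3: the date of birth on one record is a data-entry default (undercuts a2).
arguments = ["a1", "a2", "a3"]
claims = {"a1": "matched", "a2": "not matched", "a3": None}
attacks = {("a2", "a1"), ("a3", "a2")}

accepted = grounded_extension(arguments, attacks)
label = final_label(claims, accepted)
print(pair, sorted(accepted), label)
```

In this toy case the third argument defeats the objection raised by the second, so the argument in favour of matching is reinstated. A real deployment would need the paper's actual selection criterion and resolution algorithm in place of these stand-ins.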

Published in: 6th Annual International Conference on Industrial Engineering and Operations Management, Kuala Lumpur, Malaysia

Date of Conference: March 8-10, 2016

ISBN: 978-0-9855497-4-9
ISSN/E-ISSN: 2169-8767