Workshop on Knowledge Discovery, Data Mining and Machine Learning (2009, Darmstadt)
Conference venue:
Darmstadt
Conference year:
2009
Conference start date:
21.09.2009
Conference end date:
23.09.2009
Place of publication:
Darmstadt, Germany
Year:
2009
Pages:
68-71
Language:
English
Abstract:
This paper presents an approach to extracting data records from websites, particularly those with event calendars. To this end, we use language-specific key expressions and HTML patterns to recognize every single event listed on the investigated web page. One of the most notable advantages of our method is that it does not require any additional classification steps based on machine learning algorithms or keyword extraction methods; it is a so-called one-step mining technique. Our experiments on German opera websites show excellent precision and recall. Furthermore, we demonstrate that the proposed technique outperforms other data record mining applications run on event sites.