Research Collection School Of Computing and Information Systems

Web Unit Based Mining of Homepage Relationships

Aixin SUN, Nanyang Technological University
Ee Peng LIM, Singapore Management University

Publication Type

Journal Article

Publication Date

2-2006

Abstract

Homepages usually describe important semantic information about conceptual or physical entities; hence, they are the main targets for searching and browsing. To facilitate semantic-based information retrieval (IR) at a Web site, homepages can be identified and classified under some predefined concepts and these concepts are then used in query or browsing criteria, e.g., finding professor homepages containing information retrieval. In some Web sites, relationships may also exist among homepages. These relationship instances (also known as homepage relationships) enrich our knowledge about these Web sites and allow more expressive semantic-based IR. In this article, we investigate the features to be used in mining homepage relationships. We systematically develop different classes of inter-homepage features, namely, navigation, relative-location, and common-item features. We also propose deriving for each homepage a set of support pages to obtain richer and more complete content about the entity described by the homepage. The homepage together with its support pages is known as a Web unit. By extracting inter-homepage features from Web units, our experiments on the WebKB dataset show that better homepage relationship mining accuracies can be achieved.

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Publication

Journal of the American Society for Information Science and Technology (JASIST)

Volume

57

Issue

3

First Page

394

Last Page

407

ISSN

1532-2882

Identifier

10.1002/asi.20279

Publisher

Wiley

Citation

SUN, Aixin and LIM, Ee Peng. Web Unit Based Mining of Homepage Relationships. (2006). Journal of the American Society for Information Science and Technology (JASIST). 57, (3), 394-407.
Available at: https://ink.library.smu.edu.sg/sis_research/201

Additional URL

http://dx.doi.org/10.1002/asi.20279
