ACTA

Functional Dependency Extraction with Sequential Indexed Search Trees

B. Tusor^{a, b}, A.R. Várkonyi-Kóczy^{a, b}, S. Gubo^b
^aSoftware Engineering Institute, John von Neumann Faculty of Informatics, Óbuda University, Bécsi út 96/b, H-1024 Budapest, Hungary
^bDepartment of Informatics, J. Selye University, 3322 Bratislavská cesta, 945 01 Komárno, Slovakia

Functional dependency analysis is an important field of data science, where the goal is to determine the relationships between different data attributes and attribute sets in a given data set. This can yield valuable information about the data that is often not evident from surface-level analysis. In previous work, the authors proposed a functional dependency extraction method called Sequential Indexing Tables, which is a specialized variant of the Sequential Fuzzy Indexing Table (SFIT) classifier. SFITs combine lookup table classifiers with fuzzy logic to achieve very fast yet flexible classification. A special feature of the SFIT classifier is that its structure indicates the functional dependencies between the data attributes that are present in the training data set. However, the main disadvantage of SFITs is that they require a significant part of the problem space to be stored in computer memory, so their memory requirement scales exponentially with the number of attributes. To solve this issue, the authors have proposed a new classifier called Sequential Fuzzy Indexed Search Trees (SFISTs), which builds on the same idea as SFITs but uses a more compact structure while providing slightly better classification accuracy. In this paper, a functional dependency detection and extraction method is presented that is a specialized version of the SFIST classifier; it uses the same basic idea as its predecessor but has a much lower space complexity.
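To make the notion of a functional dependency concrete, the following minimal Python sketch checks whether a set of attributes determines another set in a small data set. It is a plain dictionary-based check and not the authors' indexing-table or search-tree method; the function name `holds` and the sample rows and attribute names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The dependency holds when every combination of lhs values is paired
    with exactly one combination of rhs values across the whole data set.
    """
    seen: Dict[Tuple[Hashable, ...], Tuple[Hashable, ...]] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs key;
        # any later, different rhs value violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, used only to illustrate the check.
rows = [
    {"zip": "Z1", "city": "CityA", "name": "Ann"},
    {"zip": "Z1", "city": "CityA", "name": "Bob"},
    {"zip": "Z2", "city": "CityB", "name": "Cid"},
    {"zip": "Z2", "city": "CityB", "name": "Ann"},
]

zip_determines_city = holds(rows, ["zip"], ["city"])
name_determines_city = holds(rows, ["name"], ["city"])
```

In this sample, `zip` determines `city`, whereas `name` does not, because the same name appears with two different cities. The check stores only the attribute combinations that actually occur in the data, which hints at why a sparse structure can need far less memory than a dense table covering the whole problem space.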

DOI:10.12693/APhysPolA.146.515
topics: functional dependency (FD) extraction, data analysis, data science, indexing tables