A Practical Approach to Novel Class Discovery in Tabular Data: Full training procedure

cover
26 May 2024

Authors:

(1) Troisemaine Colin, Department of Computer Science, IMT Atlantique, Brest, France., and Orange Labs, Lannion, France;

(2) Reiffers-Masson Alexandre, Department of Computer Science, IMT Atlantique, Brest, France.;

(3) Gosselin Stephane, Orange Labs, Lannion, France;

(4) Lemaire Vincent, Orange Labs, Lannion, France;

(5) Vaton Sandrine, Department of Computer Science, IMT Atlantique, Brest, France.

Abstract and Intro

Related work

Approaches

Hyperparameter optimization

Estimating the number of novel classes

Full training procedure

Experiments

Conclusion

Declarations

References

Appendix A: Additional result metrics

Appendix B: Hyperparameters

Appendix C: Cluster Validity Indices numerical results

Appendix D: NCD k-means centroids convergence study

6 Full training procedure

In the previous sections, we presented the models, the hyperparameter optimization and the estimation procedure of the number of novel classes independently. In this section, these components are brought together to form a complete training procedure. To ensure that no prior knowledge about the novel classes is ever used in this process, the number of novel classes is naturally estimated during the k-fold CV introduced in Section 4. As the whole process is quite complex, we try to summarize it in clear terms in this section and in Algorithm 1.

This paper is available on arxiv under CC 4.0 license.