Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy

Wickstrøm, Kristoffer; Løkse, Sigurd Eivindson; Kampffmeyer, Michael; Yu, Shujian; Príncipe, José C.; Jenssen, Robert

dc.contributor.author	Wickstrøm, Kristoffer
dc.contributor.author	Løkse, Sigurd Eivindson
dc.contributor.author	Kampffmeyer, Michael
dc.contributor.author	Yu, Shujian
dc.contributor.author	Príncipe, José C.
dc.contributor.author	Jenssen, Robert
dc.date.accessioned	2023-09-18T08:46:46Z
dc.date.available	2023-09-18T08:46:46Z
dc.date.created	2023-08-30T11:33:30Z
dc.date.issued	2023
dc.identifier.citation	Entropy. 2023, 25 (6)	en_US
dc.identifier.issn	1099-4300
dc.identifier.uri	https://hdl.handle.net/11250/3090001
dc.description.abstract	Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.	en_US
dc.language.iso	eng	en_US
dc.publisher	MDPI	en_US
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.subject	Deep learning	en_US
dc.subject	Kernel methods	en_US
dc.subject	Information plane	en_US
dc.subject	Information theory	en_US
dc.title	Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy	en_US
dc.title.alternative	Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy	en_US
dc.type	Journal article	en_US
dc.type	Peer reviewed	en_US
dc.description.version	publishedVersion	en_US
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	1
dc.identifier.doi	10.3390/e25060899
dc.identifier.cristin	2170874
dc.source.journal	Entropy	en_US
dc.source.volume	25	en_US
dc.source.issue	6	en_US
dc.source.pagenumber	21	en_US
dc.subject.nsi	VDP::Mathematics and natural science: 400	en_US
dc.subject.nsi	VDP::Information and communication science: 420	en_US
dc.subject.nsi	VDP::Mathematical modeling and numerical methods: 427	en_US

Associated file(s)

File name: entropy-25-00899-v2.pdf
Size: 912.5 KB
Format: PDF

This item appears in the following collection(s)

Publications from Cristin [288]
Peer-reviewed scientific journal articles and conference papers (NVI category) [219]

Unless otherwise stated, this item is licensed under Attribution 4.0 International