dc.contributor.author: Wickstrøm, Kristoffer
dc.contributor.author: Løkse, Sigurd Eivindson
dc.contributor.author: Kampffmeyer, Michael
dc.contributor.author: Yu, Shujian
dc.contributor.author: Príncipe, José C.
dc.contributor.author: Jenssen, Robert
dc.date.accessioned: 2023-09-18T08:46:46Z
dc.date.available: 2023-09-18T08:46:46Z
dc.date.created: 2023-08-30T11:33:30Z
dc.date.issued: 2023
dc.identifier.citation: Entropy. 2023, 25 (6) [en_US]
dc.identifier.issn: 1099-4300
dc.identifier.uri: https://hdl.handle.net/11250/3090001
dc.description.abstract: Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: MDPI [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.subject: Deep learning [en_US]
dc.subject: Kernel methods [en_US]
dc.subject: Information plane [en_US]
dc.subject: Information theory [en_US]
dc.title: Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy [en_US]
dc.title.alternative: Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy [en_US]
dc.type: Journal article [en_US]
dc.type: Peer reviewed [en_US]
dc.description.version: publishedVersion [en_US]
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
dc.identifier.doi: 10.3390/e25060899
dc.identifier.cristin: 2170874
dc.source.journal: Entropy [en_US]
dc.source.volume: 25 [en_US]
dc.source.issue: 6 [en_US]
dc.source.pagenumber: 21 [en_US]
dc.subject.nsi: VDP::Mathematics and natural science: 400 [en_US]
dc.subject.nsi: VDP:: Information and communication science: 420 [en_US]
dc.subject.nsi: VDP:: Mathematical modeling and numerical methods: 427 [en_US]


Except where otherwise noted, this item is licensed under Attribution 4.0 International.