Question: I am facing an issue because I am trying to concatenate a one hot encoding vector of size 5 with the patches of the image
I am facing an issue because I am trying to concatenate a one hot encoding vector of size with the patches of the image I know I am using a different approach in the code I am using a pretrained vision transformer model vitbasepatch It is a multilabel classification problem.
Ideally, I would feed the model the image and the resolution and I would like the model to predict the most suitable labels. An image of the dataset is attached below.
And this is the model I am using; I have done all the necessary steps :
class ResizingModelMultilabelImageClassificationBase:
def initself numclasses:
superResizingModel selfinit
self.numclasses numclasses
self.vit ViTModel.frompretrainedgooglevitbasepatchink
self.fc nnLinearselfvit.config.hiddensize numclasses
# printfInput size of self.clsembeddingsize: selfvit.config.hiddensize
# printfOutput size of self.fc: numclasses
def forwardself x targetresolutiononehot:
# Process input image through Vision Transformer
vitoutput self.vitx
# Extract the CLS token embedding from the Vision Transformer output from the last hidden state
#the hidden state refers to the information
clstokenembedding vitoutput.lasthiddenstate: : # Use only the CLS token
# printShape of clstokenembedding:", clstokenembedding.shape # Shape
# printShape of target res:", targetresolutiononehot.shape # Shape
concatenatedinput torch.catclstokenembedding, targetresolutiononehot dim
# Apply linear layer
logits self.fcconcatenatedinput
return logits
I have also attached an image of the sample output, my main concern is that the accuracy is extremely low ~
I would appreciate it if you could help me in solve this problem.
tableimageresolution,CRSCSNSSCLimagescity.jpgimagescity.jpgimagescity.jpgimagescity.jpgimagescity.jpgimages crowd.jpgimages crowd.jpgimages crowd.jpgimages crowd.jpg
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
