Question:

I am working on a survival analysis task using lung cancer data, which includes CT images (DICOM format), segmentation masks, and clinical data. I've extracted radiomic features using PyRadiomics, but I need guidance on the best preprocessing and modeling approaches for survival analysis. The model should output time-to-event predictions (survival time and event status). I have several specific questions:
1. Data preprocessing and variability in slices: Each patient has a different number of CT slices and corresponding segmentation masks. How should I handle this variability during preprocessing? Is there an optimal way to resample, interpolate, or pad the slices so that the input data is consistent across patients?
2. Handling CT images and segmentation masks: Should the segmentation masks be used as additional input channels along with the CT images, or should they be processed separately and combined later? How can I align these masks with the CT slices if the number of slices is inconsistent between patients?
3. Input for CNN-based models: Given the variable number of slices, what is the best approach to format the CT images (and possibly masks) as input to a CNN? Should I use a 2D CNN on individual slices or a 3D CNN to process the full volume of slices per patient? How should I structure the input if the number of slices varies?
4. Feature extraction vs. raw image input: Considering I have already extracted radiomic features, should I use these features directly for survival analysis, or is it more beneficial to feed the raw CT images into a deep learning model (CNN) for automatic feature learning? How can I combine these extracted features with clinical and imaging data?
5. Multimodal approaches: If I want to combine multiple data modalities (CT images, segmentation masks, and clinical data), what is the best strategy to preprocess and merge these data types for survival prediction? Are there architectures specifically suited for integrating multimodal inputs?
6. Survival-specific deep learning models: Are there specific CNN architectures designed for survival analysis that account for censored data and time-to-event predictions? Should I adapt a standard CNN architecture to include survival-specific loss functions such as the Cox proportional hazards loss?
7. Dealing with multiple CT slices for survival analysis: How should I aggregate information from multiple CT slices, given the variability in the number of slices per patient? Is it more effective to summarize the features from each slice and then apply a pooling mechanism, or should I use 3D convolutions to process the entire volume? What are the trade-offs between these approaches?
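On the slice-count variability question: a common first step is to resample every scan to one isotropic voxel spacing, so that anatomical distances (not slice counts) are comparable across patients. A minimal sketch using scipy.ndimage.zoom, assuming each volume is already loaded from DICOM as a (z, y, x) NumPy array with known voxel spacing; the function name and the 1 mm target spacing are illustrative choices, not a fixed recommendation:

```python
import numpy as np
from scipy.ndimage import zoom

def resample_volume(volume, spacing, target_spacing=(1.0, 1.0, 1.0), order=1):
    """Resample a CT volume (z, y, x) to the given target voxel spacing.

    volume:  3D numpy array of intensities
    spacing: (z, y, x) voxel size in mm for this patient
    order=1 -> trilinear interpolation for images; use order=0 for masks.
    """
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, factors, order=order)

# Example: a 40-slice scan with 2.5 mm slice thickness and 1 mm in-plane spacing
vol = np.random.rand(40, 128, 128).astype(np.float32)
iso = resample_volume(vol, spacing=(2.5, 1.0, 1.0))
print(iso.shape)  # (100, 128, 128)
```

After this step every patient is on the same physical grid, and remaining shape differences reflect real anatomy rather than scanner protocol.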
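On using masks as input channels: if the mask is resampled onto exactly the same grid as the CT (with nearest-neighbour interpolation so labels stay binary), it can simply be stacked as a second channel, which also solves the alignment question. A sketch under the same NumPy-array assumptions as above; helper names are illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def stack_image_and_mask(ct, mask, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Resample a CT volume and its mask onto the same grid, then stack
    them channel-first as (2, z, y, x)."""
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    ct_r = zoom(ct, factors, order=1)      # trilinear for intensities
    mask_r = zoom(mask, factors, order=0)  # nearest-neighbour keeps labels binary
    assert ct_r.shape == mask_r.shape      # same factors -> same output grid
    return np.stack([ct_r, mask_r], axis=0)

ct = np.random.rand(40, 64, 64)
mask = (np.random.rand(40, 64, 64) > 0.7).astype(np.uint8)
x = stack_image_and_mask(ct, mask, spacing=(2.5, 1.0, 1.0))
print(x.shape)  # (2, 100, 64, 64)
```

The alternative (processing the mask separately and fusing later) mainly pays off when the mask is used to crop a tumor region of interest rather than as a dense channel.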
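On fixed-size CNN input: after spatial resampling, a 3D CNN still needs a constant tensor shape, which is usually obtained by center-cropping or padding along the slice axis. A sketch, assuming intensities are in Hounsfield units so air (-1000 HU) is a sensible pad value; the target depth of 64 is an arbitrary illustration:

```python
import numpy as np

def fix_depth(volume, target_depth=64, pad_value=-1000.0):
    """Center-crop or center-pad a (z, y, x) volume along z to a fixed depth.

    pad_value of -1000 HU corresponds to air on a calibrated CT scan.
    """
    z = volume.shape[0]
    if z >= target_depth:
        start = (z - target_depth) // 2
        return volume[start:start + target_depth]
    pad = target_depth - z
    before, after = pad // 2, pad - pad // 2
    return np.pad(volume, ((before, after), (0, 0), (0, 0)),
                  constant_values=pad_value)

print(fix_depth(np.zeros((40, 32, 32))).shape)  # (64, 32, 32)
print(fix_depth(np.zeros((90, 32, 32))).shape)  # (64, 32, 32)
```

A 2D CNN avoids this constraint entirely (each slice is one sample) but then needs a pooling step across slices, as discussed in the last question.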
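On radiomic features vs. raw images: since PyRadiomics features are already extracted, a reasonable baseline is to standardize each feature block separately and concatenate it with encoded clinical covariates into one design matrix for a classical survival model (e.g. Cox regression), before investing in end-to-end deep learning. A NumPy sketch with purely illustrative shapes and synthetic values:

```python
import numpy as np

# Hypothetical per-patient feature blocks (sizes are illustrative)
radiomic = np.random.rand(5, 100)   # 100 PyRadiomics features per patient
clinical = np.random.rand(5, 8)     # age, stage, etc., already numerically encoded

def combine_features(radiomic, clinical):
    """Z-score each block separately, then concatenate, so that neither
    modality dominates purely through its scale or dimensionality."""
    def zscore(x):
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    return np.hstack([zscore(radiomic), zscore(clinical)])

X = combine_features(radiomic, clinical)
print(X.shape)  # (5, 108)
```

The resulting matrix can feed any tabular survival model; raw-image CNN features, once available, can be appended as a third block in exactly the same way.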
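On multimodal architectures: a widely used pattern is late fusion, where each modality gets its own encoder and the resulting embeddings are concatenated before a shared survival head. The toy NumPy sketch below shows only the fusion wiring; the random weights stand in for trained parameters, and all shapes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    return x @ w + b

# Illustrative late-fusion head: one projection per modality, concatenation,
# then a single layer mapping the fused embedding to a scalar log-risk.
img_emb = rng.standard_normal((4, 512))   # e.g. pooled output of a 3D CNN backbone
clin = rng.standard_normal((4, 8))        # encoded clinical covariates

w_img, b_img = rng.standard_normal((512, 64)) * 0.05, np.zeros(64)
w_cl, b_cl = rng.standard_normal((8, 64)) * 0.05, np.zeros(64)
w_out, b_out = rng.standard_normal((128, 1)) * 0.05, np.zeros(1)

fused = np.concatenate([np.tanh(linear(img_emb, w_img, b_img)),
                        np.tanh(linear(clin, w_cl, b_cl))], axis=1)
log_risk = linear(fused, w_out, b_out)
print(log_risk.shape)  # (4, 1)
```

Projecting each modality to a comparable embedding width before fusing prevents the high-dimensional image branch from drowning out the small clinical vector.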
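On survival-specific losses: the DeepSurv family adapts a standard network by replacing the classification head with a single log-risk output trained with the negative Cox partial log-likelihood, which accounts for right-censored patients. A NumPy sketch of that loss (Breslow form, ignoring tied event times), with a tiny synthetic cohort:

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood (Breslow approximation, no ties).

    risk:  predicted log-hazard per sample, shape (n,) -- the model output
    time:  observed follow-up times
    event: 1 if the event was observed, 0 if the sample is censored
    Only uncensored samples contribute terms; the risk set of sample i is
    everyone still under observation at time[i] (time >= time[i]).
    """
    order = np.argsort(-time)                    # sort by descending time
    risk, event = risk[order], event[order]
    log_cumsum = np.logaddexp.accumulate(risk)   # log-sum-exp over each risk set
    return -np.sum((risk - log_cumsum) * event) / max(event.sum(), 1)

r = np.array([0.5, -0.2, 1.0, 0.0])
t = np.array([5.0, 8.0, 3.0, 10.0])
e = np.array([1, 0, 1, 1])           # second patient is censored
loss = cox_neg_log_partial_likelihood(r, t, e)
print(float(loss) >= 0.0)  # True: each partial-likelihood term is non-positive
```

The same expression, written with differentiable tensor ops, serves as the training loss for a CNN whose final layer outputs one log-risk per patient; discrete-time alternatives (e.g. DeepHit-style losses) predict a distribution over time bins instead.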
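On aggregating variable slice counts: if a 2D CNN produces one feature vector per slice, any permutation-invariant pooling step maps an arbitrary number of slices to a fixed-size patient vector; attention-based multiple-instance-learning pooling is the learned analogue of the simple mean/max shown here. A minimal sketch with illustrative feature sizes:

```python
import numpy as np

def pool_slice_features(slice_feats, mode="mean"):
    """Aggregate per-slice feature vectors (n_slices, d) into one patient
    vector (d,) -- a simple stand-in for attention-based MIL pooling."""
    if mode == "mean":
        return slice_feats.mean(axis=0)
    if mode == "max":
        return slice_feats.max(axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Patients with different slice counts still map to same-sized vectors
a = pool_slice_features(np.random.rand(40, 256))
b = pool_slice_features(np.random.rand(75, 256), mode="max")
print(a.shape, b.shape)  # (256,) (256,)
```

The trade-off versus 3D convolutions: slice-wise pooling is cheaper and tolerant of thick or missing slices but discards inter-slice context, while a 3D CNN models through-plane structure at the cost of memory and a fixed input depth.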
