Mitotic counting is a vital prognostic marker of tumor proliferation in breast cancer. Deep learning-based mitotic detection is on par with pathologists, but it requires large labeled data for training. We propose a deep classification framework for enhancing mitosis detection by leveraging class label information, via softmax loss, and spatial distribution information among samples, via distance metric learning. We also investigate strategies towards steadily providing informative samples to boost the learning. The efficacy of the proposed framework is established through evaluation on ICPR 2012 and AMIDA 2013 mitotic data. Our framework significantly improves the detection with small training data and achieves on par or superior performance compared to state-of-the-art methods for using the entire training data.
This paper studies vehicle attribute recognition by appearance. In the literature, image-based target recognition has been extensively investigated in many use cases, such as facial recognition, but less so in the field of vehicle attribute recognition. We survey a number of algorithms that identify vehicle properties ranging from coarse-grained level (vehicle type) to fine-grained level (vehicle make and model). Moreover, we discuss two alternative approaches for these tasks, including straightforward classification and a more flexible metric learning method. Furthermore, we design a simulated real-world scenario for vehicle attribute recognition and present an experimental comparison of the two approaches.
Recent successes of deep learning-based recognition rely on maintaining the content related to the main-task label. However, how to explicitly dispel the noisy signals for better generalization in a controllable manner remains an open issue. For instance, various factors such as identity-specific attributes, pose, illumination and expression affect the appearance of face images. Disentangling the identity-specific factors is potentially beneficial for facial expression recognition (FER). This chapter systematically summarize the detrimental factors as task-relevant/irrelevant semantic variations and unspecified latent variation. In this chapter, these problems are casted as either a deep metric learning problem or an adversarial minimax game in the latent space. For the former choice, a generalized adaptive (N+M)-tuplet clusters loss function together with the identity-aware hard-negative mining and online positive mining scheme can be used for identity-invariant FER. The better FER performance can be achieved by combining the deep metric loss and softmax loss in a unified two fully connected layer branches framework via joint optimization. For the latter solution, it is possible to equipping an end-to-end conditional adversarial network with the ability to decompose an input sample into three complementary parts. The discriminative representation inherits the desired invariance property guided by prior knowledge of the task, which is marginal independent to the task-relevant/irrelevant semantic and latent variations. The framework achieves top performance on a serial of tasks, including lighting, makeup, disguise-tolerant face recognition and facial attributes recognition. This chapter systematically summarize the popular and practical solution for disentanglement to achieve more discriminative visual recognition.