** In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition-the factorization of a matrix into the product of lower and upper triangular matrices. And now, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition and given the paucity of scope to present this discussion, e.g., the separated analysis of the Euclidean space, Hermitian space, Hilbert space, and things in the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields. **

** Embedded and Internet-of-Things (IoT) devices have seen an increase in adoption in many domains. The security of these devices is of great importance as they are often used to control critical infrastructure, medical devices, and vehicles. Existing solutions to isolate microcontroller (MCU) resources in order to increase their security face significant challenges such as specific hardware unavailability, Memory Protection Unit (MPU) limitations and a significant lack of Direct Memory Access (DMA) support. Nevertheless, DMA is fundamental for the power and performance requirements of embedded applications. In this paper, we present D-Box, a systematic approach to enable secure DMA operations for compartmentalization solutions of embedded applications using real-time operating systems (RTOS). D-Box defines a reference architecture and a workflow to protect DMA operations holistically. It provides practical methods to harden the kernel and define capability-based security policies for easy definition of DMA operations with strong security properties. We implemented a D-Box prototype for the Cortex-M3/M4 on top of the popular FreeRTOS-MPU (F-MPU). The D-Box procedures and a stricter security model enabled DMA operations, yet it exposed 41 times less ROP (return-orienting-programming) gadgets when compared with the standard F-MPU. D-Box adds only a 2% processor overhead while reducing the power consumption of peripheral operation benchmarks by 18.2%. The security properties and performance of D-Box were tested and confirmed on a real-world case study of a Programmable Logic Controller (PLC) application. **

** In this work, we study a variant of nonnegative matrix factorization where we wish to find a symmetric factorization of a given input matrix into a sparse, Boolean matrix. Formally speaking, given $\mathbf{M}\in\mathbb{Z}^{m\times m}$, we want to find $\mathbf{W}\in\{0,1\}^{m\times r}$ such that $\| \mathbf{M} - \mathbf{W}\mathbf{W}^\top \|_0$ is minimized among all $\mathbf{W}$ for which each row is $k$-sparse. This question turns out to be closely related to a number of questions like recovering a hypergraph from its line graph, as well as reconstruction attacks for private neural network training. As this problem is hard in the worst-case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation. Equivalently, this can be thought of as recovering a uniformly random $k$-uniform hypergraph from its line graph. Our main result is a polynomial-time algorithm for this problem based on bootstrapping higher-order information about $\mathbf{W}$ and then decomposing an appropriate tensor. The key ingredient in our analysis, which may be of independent interest, is to show that such a matrix $\mathbf{W}$ has full column rank with high probability as soon as $m = \widetilde{\Omega}(r)$, which we do using tools from Littlewood-Offord theory and estimates for binary Krawtchouk polynomials. **

** This paper uses the concept of algorithmic efficiency to present a unified theory of intelligence. Intelligence is defined informally, formally, and computationally. I introduce the concept of Dimensional complexity in algorithmic efficiency and deduce that an optimally efficient algorithm has zero Time complexity, zero Space complexity, and an infinite Dimensional complexity. This algorithm is then used to generate the number line. **

** Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. Experiments using hundreds of matrices from diverse domains show that it often runs $100\times$ faster than exact matrix products and $10\times$ faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds. These results suggest that a mixture of hashing, averaging, and byte shuffling$-$the core operations of our method$-$could be a more promising building block for machine learning than the sparsified, factorized, and/or scalar quantized matrix products that have recently been the focus of substantial research and hardware investment. **

** We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above. **

** Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great successes. Unfortunately, the understanding on how it works remains unclear. It has the central importance to lay down the theoretic foundation for deep learning. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle attributing to the success is the manifold structure in data, namely natural high dimensional data concentrates close to a low-dimensional manifold, deep learning learns the manifold and the probability distribution on it. We further introduce the concepts of rectified linear complexity for deep neural network measuring its learning capability, rectified linear complexity of an embedding manifold describing the difficulty to be learned. Then we show for any deep neural network with fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space. **

A Convolutional Feature Map based Deep Network targeted towards Traffic Detection and Classification

** This research mainly emphasizes on traffic detection thus essentially involving object detection and classification. The particular work discussed here is motivated from unsatisfactory attempts of re-using well known pre-trained object detection networks for domain specific data. In this course, some trivial issues leading to prominent performance drop are identified and ways to resolve them are discussed. For example, some simple yet relevant tricks regarding data collection and sampling prove to be very beneficial. Also, introducing a blur net to deal with blurred real time data is another important factor promoting performance elevation. We further study the neural network design issues for beneficial object classification and involve shared, region-independent convolutional features. Adaptive learning rates to deal with saddle points are also investigated and an average covariance matrix based pre-conditioned approach is proposed. We also introduce the use of optical flow features to accommodate orientation information. Experimental results demonstrate that this results in a steady rise in the performance rate. **

** Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning. **