We provide a general existence result by using a new, improved Moser-Trudinger-type inequality and by introducing a topological join construction in order to describe the interaction of the two components u_1 and u_2.

Source: Anal. PDE, Volume 8, Number 8. Keywords: geometric PDEs, variational methods, min-max schemes.

Related articles:
- Tong Li, Hui Yin. Convergence rate to strong boundary layer solutions for generalized BBM-Burgers equations with non-convex flux (Electronic Research Announcements).
- Isaac Alvarez-Romero, Gerald Teschl. On uniqueness properties of solutions of the Toda and Kac-van Moerbeke hierarchies.
- Haiyan Yin, Changjiang Zhu. Convergence rate of solutions toward stationary solutions to a viscous liquid-gas two-phase flow model in a half line.
- Stationary waves to the two-fluid non-isentropic Navier-Stokes-Poisson system in a half line: Existence, stability and convergence rate.
- Classification of positive solutions to a Lane-Emden type integral system with negative exponents.
- Convergence to equilibria of solutions to a conserved Phase-Field system with memory.

Published by the American Institute of Mathematical Sciences. Keywords: convergence rate, Toda system, sharp estimates, bubbling solutions. Citation: Weiwei Ao.
The common strategy to train a predictive model is disambiguation, i.e., trying to recover the ground-truth label from the candidate label set.
Recently, feature-aware disambiguation was proposed to generate different labeling confidences over the candidate label set by utilizing the graph structure of the feature space. However, the existence of noise and outliers in training data makes the similarity derived from the original features less reliable. To this end, we propose a novel approach for partial label learning based on adaptive graph guided disambiguation (PL-AGGD). Compared with a fixed graph, an adaptive graph can be more robust and accurate in revealing the intrinsic manifold structure within the data. Moreover, instead of the two-stage strategy in previous algorithms, our approach performs label disambiguation and predictive model training simultaneously.
Specifically, we present a unified framework which jointly optimizes the ground-truth labeling confidences, similarity graph and model parameters to achieve strong generalization performance.
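The adaptive-graph machinery is beyond a short snippet, but the graph-guided disambiguation step itself can be sketched as follows. This is a simplified illustration, not the authors' PL-AGGD algorithm: it propagates labeling confidences over a fixed kNN similarity graph while clamping each instance's confidence to its candidate label set (`disambiguate` and its parameters are hypothetical names):

```python
import numpy as np

def disambiguate(X, candidates, n_classes, k=3, alpha=0.5, iters=50):
    """Propagate labeling confidences over a kNN similarity graph,
    restricting each instance's confidence to its candidate labels."""
    n = len(X)
    # Gaussian-weighted kNN similarity graph.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # skip self at position 0
        W[i, nbrs] = np.exp(-d2[i, nbrs])
    W = (W + W.T) / 2
    W /= W.sum(1, keepdims=True)                # row-normalise

    # Uniform initial confidences over each candidate set.
    F = np.zeros((n, n_classes))
    for i, cand in enumerate(candidates):
        F[i, sorted(cand)] = 1.0 / len(cand)
    mask = F > 0                                # candidate-label mask

    for _ in range(iters):
        F = alpha * (W @ F) + (1 - alpha) * F   # smooth over the graph
        F *= mask                               # clamp to candidate labels
        F /= F.sum(1, keepdims=True)            # renormalise per instance
    return F
```

On two well-separated clusters, an ambiguous instance inherits the label of its unambiguous neighbours, which is the basic disambiguation effect described above.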
A Variational Analysis of the Toda System on Compact Surfaces
Extensive experiments show that PL-AGGD performs favorably against state-of-the-art partial label learning approaches.

Attributed networks are pervasive in numerous high-impact domains. As opposed to conventional plain networks, where only pairwise node dependencies are observed, both the network topology and node attribute information are readily available on attributed networks. More often than not, the nodal attributes are depicted in a high-dimensional feature space and are therefore notoriously difficult to tackle due to the curse of dimensionality. Additionally, features that are irrelevant to the network structure could hinder the discovery of actionable patterns from attributed networks.
Hence, it is important to leverage feature selection to find a high-quality feature subset that is tightly correlated to the network structure. The few existing efforts either model the network structure at a macro level by community analysis or directly make use of the binary relations. Consequently, they fail to exploit the finer-grained tie strength information for feature selection and may lead to suboptimal results.
Motivated by the sociology findings, in this work, we investigate how to harness the tie strength information embedded on the network structure to facilitate the selection of relevant nodal attributes. Methodologically, we propose a principled unsupervised feature selection framework ADAPT to find informative features that can be used to regenerate the observed links and further characterize the adaptive neighborhood structure of the network.
Extensive experimental studies on various real-world attributed networks validate the superiority of the proposed ADAPT framework.

Early classification of time series is the prediction of the class label of a time series before it is observed in its entirety. In time-sensitive domains where information is collected over time, it is worth sacrificing some classification accuracy in favor of earlier predictions, ideally early enough for actions to be taken. However, since accuracy and earliness are contradictory objectives, a solution must address this challenge to discover task-dependent trade-offs.
We design an early classification model, called EARLIEST, which tackles this multi-objective optimization problem, jointly learning (1) to classify time series and (2) at which timestep to halt and generate this prediction. By learning the objectives together, we achieve a user-controlled balance between these contradictory goals while capturing their natural relationship.
Our model consists of the novel pairing of a recurrent discriminator network with a stochastic policy network, with the latter learning a halting policy as a reinforcement learning task. The learned policy interprets representations generated by the recurrent model and controls its dynamics, sequentially deciding whether or not to request observations from future timesteps. For a rich variety of datasets (four synthetic and three real-world), we demonstrate that EARLIEST consistently outperforms state-of-the-art alternatives in accuracy and earliness while discovering signal locations without supervision.
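A greedy, deterministic caricature of such a halting policy may help fix ideas. This is not the trained EARLIEST model (which learns the policy stochastically with reinforcement learning); all weights and shapes here are hypothetical stand-ins:

```python
import numpy as np

def early_classify(series, W_h, W_x, w_halt, W_cls, threshold=0.5):
    """Consume a time series step by step with a toy RNN; a sigmoid
    halting head decides at each step whether to stop and classify."""
    h = np.zeros(W_h.shape[0])
    for t, x in enumerate(series):
        h = np.tanh(W_h @ h + W_x @ x)                 # recurrent update
        p_halt = 1.0 / (1.0 + np.exp(-(w_halt @ h)))   # halting probability
        if p_halt > threshold or t == len(series) - 1:
            return t, int(np.argmax(W_cls @ h))        # halt and predict
```

A trained model would instead sample the halting decision and backpropagate a reward that trades off accuracy against earliness.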
Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered to be a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer.
However, as an emerging domain, several challenges remain, including (1) the lack of global convergence guarantees, (2) slow convergence towards solutions, and (3) cubic time complexity with respect to feature dimensions. To address these challenges, we propose the dlADMM algorithm. The parameters in each layer are updated backward and then forward so that the parameter information in each layer is exchanged efficiently. The time complexity is reduced from cubic to quadratic in the latent feature dimensions via a dedicated algorithm design for the subproblems that enhances them using iterative quadratic approximations and backtracking.
Experiments on benchmark datasets demonstrated that our proposed dlADMM algorithm outperforms most of the comparison methods.

Network embedding, which aims to represent network data in a low-dimensional space, has been commonly adopted for analyzing heterogeneous information networks (HINs). Although existing HIN embedding methods have achieved performance improvements to some extent, they still face a few major weaknesses. Most importantly, they usually adopt negative sampling to randomly select nodes from the network, and they do not learn the underlying distribution for more robust embedding.
Compared to existing HIN embedding methods, our generator would learn the node distribution to generate better negative samples. Compared to GANs on homogeneous networks, our discriminator and generator are designed to be relation-aware in order to capture the rich semantics on HINs. Furthermore, towards more effective and efficient sampling, we propose a generalized generator, which samples "latent" nodes directly from a continuous distribution, not confined to the nodes in the original network as existing methods are. Finally, we conduct extensive experiments on four real-world datasets.
Results show that we consistently and significantly outperform state-of-the-art baselines across all datasets and tasks.

Mobile user profiles are a summary of characteristics of user-specific mobile activities. Mobile user profiling aims to extract a user's interests and behavioral patterns from mobile behavioral data. While some efforts have been made for mobile user profiling, existing methods can be improved via representation learning with awareness of substructures in users' behavioral graphs. Specifically, in this paper, we study the problem of mobile user profiling with POI check-in data.
To this end, we first construct a graph, where a vertex is a POI category and an edge is the transition frequency of a user between two POI categories, to represent each user. We then formulate mobile user profiling as a task of representation learning from user behavioral graphs. We later develop a deep adversarial substructured learning framework for the task. This framework has two mutually-enhanced components. The first component is to preserve the structure of the entire graph, which is formulated as an encoding-decoding paradigm.
In particular, the structure of the entire graph is preserved by minimizing reconstruction loss between an original graph and a reconstructed graph. The second component is to preserve the structure of subgraphs, which is formulated as a substructure detector based adversarial training paradigm.
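A minimal sketch of the first, encode-decode component (a toy graph autoencoder with hypothetical shapes, not the authors' exact architecture): embed the nodes, decode the adjacency matrix with an inner product, and measure the reconstruction loss:

```python
import numpy as np

def reconstruction_loss(A, d=2, lr=0.5, steps=300, seed=0):
    """Learn node embeddings Z so that sigmoid(Z Z^T) reconstructs the
    adjacency matrix A; returns Z and the binary cross-entropy loss."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Z = rng.normal(scale=0.1, size=(n, d))
    for _ in range(steps):
        A_hat = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))
        grad = 2.0 * (A_hat - A) @ Z / n**2     # gradient of mean BCE
        Z -= lr * grad
    A_hat = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))
    bce = -(A * np.log(A_hat + 1e-9)
            + (1 - A) * np.log(1 - A_hat + 1e-9)).mean()
    return Z, bce
```

Minimizing this loss preserves the structure of the entire graph; the adversarial component would additionally match detected substructures.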
In particular, this paradigm includes a substructure detector and an adversarial trainer. Instead of using non-differentiable substructure detection algorithms, we pre-train a differentiable convolutional neural network as the detector to approximate these detection algorithms. The adversarial trainer is to match the detected substructure of the reconstructed graph to the detected substructure of the original graph. Also, we provide an effective solution for the optimization problems. Moreover, we exploit the learned representations of users for the next activity type prediction.
Finally, we present extensive experimental results to demonstrate the improved performance of the proposed method.

Semi-supervised learning is sought to leverage unlabelled data when labelled data is difficult or expensive to acquire. Deep generative models, e.g., variational autoencoders (VAEs), are one family of approaches. However, the latent code learned by the traditional VAE is not exclusive or repeatable for a specific input sample, which prevents it from achieving excellent classification performance. In particular, the learned latent representation depends on a non-exclusive component which is stochastically sampled from the prior distribution.
Moreover, the semi-supervised GAN models generate data from a pre-defined distribution. To address the aforementioned issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning, which leverages both the advantage of the GAN as a high-quality generative model and the VAE as a posterior distribution learner.

We propose the first adversarially robust algorithm for monotone submodular maximization under single and multiple knapsack constraints, with scalable implementations in distributed and streaming settings. For a single knapsack constraint, our algorithm outputs a robust summary of almost optimal (up to polylogarithmic factors) size, from which a constant-factor approximation to the optimal solution can be constructed.
For multiple knapsack constraints, our approximation is within a constant factor of the best known non-robust solution. We evaluate the performance of our algorithms by comparison to natural robustifications of existing non-robust algorithms under two objectives: (1) dominating set for large social network graphs from Facebook and Twitter collected by the Stanford Network Analysis Project (SNAP), and (2) movie recommendations on a dataset from MovieLens.
Experimental results show that our algorithms give the best objective for a majority of the inputs and show strong performance even compared to offline algorithms that are given the set of removals in advance.

Traditional recommender systems rely on user feedback on items, such as ratings or clicks, to analyze user interest and provide personalized recommendations. However, rating or click feedback is limited in that it does not tell exactly why users like or dislike an item. If a user does not like the recommendations and cannot effectively express the reasons via rating and clicking, the feedback from the user may be very sparse.
These limitations lead to inefficient model learning of the recommender system.
To address these limitations, more effective user feedback mechanisms should be designed, so that the system can effectively understand a user's preference and improve the recommendations over time. In this paper, we propose a novel dialog-based recommender system to interactively recommend a list of items with visual appearance.
At each time step, the user receives a list of recommended items with visual appearance. The user can point to some items and describe, in natural language, feedback such as the desired features of the items they want. With this natural-language feedback, the recommender system updates and provides another list of items. To model the user behaviors of viewing, commenting and clicking on a list of items, we propose a visual dialog augmented cascade model.
To efficiently understand the user preference and learn the model, exploration should be encouraged to provide more diverse recommendations and quickly collect user feedback on more attributes of the items. We propose a variant of the cascading bandits, in which neural representations of the item images and of user feedback in natural language are utilized. In a task of recommending a list of footwear, we show that our visual dialog augmented interactive recommender needs around

We introduce and release a new large-scale dataset based on Wikipedia and Wikidata to train relation classifiers and end-to-end fact extraction models.
The end-to-end models are shown to be able to extract complete sets of facts from datasets with full pages of text. We then analyse multiple models that estimate factual accuracy on a Wikipedia text summarization task, and show their efficacy compared to ROUGE and other model-free variants by conducting a human evaluation study.

Visualization of high-dimensional data is a fundamental yet challenging problem in data mining.
These visualization techniques are commonly used to reveal patterns in high-dimensional data, such as clusters and the similarity among clusters. Recently, some successful visualization tools, e.g., t-SNE, have been developed. However, there are two limitations: (1) they cannot capture the global data structure well. Thus, their visualization results are sensitive to initialization, which may cause confusion in the data analysis.
(2) They are not suitable for implementation on the GPU platform, because their complex algorithm logic, high memory cost, and random memory access mode lead to low hardware utilization. To address the aforementioned problems, we propose a novel visualization approach named Anchor-t-SNE (AtSNE), which provides an efficient GPU-based visualization solution for large-scale and high-dimensional data.
Specifically, we generate a number of anchor points from the original data and regard them as the skeleton of the layout, which holds the global structure information. We propose a hierarchical optimization approach to optimize the positions of the anchor points and the ordinary data points in the layout simultaneously. Our approach yields much better and more robust visual effects on 11 public datasets and achieves 5 to 28 times speed-up on different datasets, compared with the current state-of-the-art methods.
In particular, we deliver a high-quality 2-D layout for a 20-million-point, high-dimensional dataset within 5 hours, while the current methods fail to give results due to running out of memory.

Backbones refer to critical tree structures that span a set of nodes of interest in networks. This paper introduces a novel class of attributed backbones and detection algorithms in richly attributed networks.
Unlike conventional models, attributed backbones capture dynamics in the edge cost model: the model specifies affinitive attributes for each edge, and the cost of each edge is dynamically determined by the selection of its associated affinitive attributes and the closeness of their values at its end nodes. Backbone discovery is to compute an attributed backbone that covers the nodes of interest with the smallest connection cost, dynamically determined by the selected affinitive attributes.
KDD | Proceedings
While this problem is hard to approximate, we develop feasible algorithms within practical reach for large attributed networks. Using real-world networks, we verify the effectiveness and efficiency of our algorithms and show their applications in collaboration recommendation.

To help enforce data-protection regulations such as GDPR and detect unauthorized uses of personal data, we develop a new model auditing technique that helps users check if their data was used to train a machine learning model.
We focus on auditing deep-learning models that generate natural-language text, including word prediction and dialog generation. These models are at the core of popular online services and are often trained on personal data such as users' messages, searches, chats, and comments.
We design and evaluate a black-box auditing method that can detect, with very few queries to a model, if a particular user's texts were used to train it among thousands of other users. We empirically show that our method can successfully audit well-generalized models that are not overfitted to the training data. We also analyze how text-generation models memorize word sequences and explain why this memorization makes them amenable to auditing.

Feature selection is the preprocessing step in machine learning which tries to select the most relevant features for the subsequent prediction task.
Effective feature selection could help reduce dimensionality, improve prediction accuracy and increase result comprehensibility. It is very challenging to find the optimal feature subset from the subset space as the space could be very large. While much effort has been made by existing studies, reinforcement learning can provide a new perspective for the searching strategy in a more global way.
In this paper, we propose a multi-agent reinforcement learning framework for the feature selection problem. Specifically, we first reformulate feature selection within a reinforcement learning framework by regarding each feature as an agent. Then, we obtain the state of the environment in three ways. We show how to learn the state representation in a graph-based way, which can tackle the case when not only the edges, but also the nodes, change step by step.
In addition, we study how the coordination between different features can be improved by a more reasonable reward scheme. The proposed method can search the feature subset space globally and can be easily adapted to the real-time case (real-time feature selection) due to the nature of reinforcement learning. Also, we provide an efficient strategy to accelerate the convergence of multi-agent reinforcement learning. Finally, extensive experimental results show the significant improvement of the proposed method over conventional approaches.
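To make the formulation concrete, here is a heavily simplified sketch in this spirit, not the paper's method: each feature is an agent with a two-action (off/on) value table, agents act epsilon-greedily, and all agents share a cheap accuracy proxy as the reward (`select_features` and the reward choice are hypothetical):

```python
import numpy as np

def select_features(X, y, episodes=150, eps=0.2, seed=0):
    """Each feature is an agent choosing off/on; the shared reward is
    leave-one-out 1-NN accuracy on the currently selected columns."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Q = np.zeros((d, 2))
    counts = np.ones((d, 2))
    for _ in range(episodes):
        # Epsilon-greedy joint action of all feature agents.
        actions = np.where(rng.random(d) < eps,
                           rng.integers(0, 2, d),
                           Q.argmax(1))
        if actions.sum() == 0:                   # keep at least one feature
            actions[rng.integers(d)] = 1
        cols = actions.astype(bool)
        # Shared reward: leave-one-out 1-NN accuracy on selected columns.
        reward = 0.0
        for i in range(n):
            dist = ((X[:, cols] - X[i, cols]) ** 2).sum(1)
            dist[i] = np.inf
            reward += float(y[np.argmin(dist)] == y[i])
        reward /= n
        # Each agent updates the value of the action it just took.
        for j in range(d):
            a = actions[j]
            counts[j, a] += 1
            Q[j, a] += (reward - Q[j, a]) / counts[j, a]
    return Q.argmax(1).astype(bool)
```

With an informative feature and a noise feature, the informative agent quickly learns that switching itself on yields higher shared reward.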
Network embedding (NE) aims to embed the nodes of a network into a vector space, and serves as the bridge between machine learning and network data. Despite their widespread success, NE algorithms typically contain a large number of hyperparameters for preserving the various network properties, which must be carefully tuned in order to achieve satisfactory performance. Though automated machine learning (AutoML) has achieved promising results when applied to many types of data such as images and texts, network data poses great challenges to AutoML and remains largely ignored by the AutoML literature.
The biggest obstacle is the massive scale of real-world networks, along with the coupled node relationships that make any straightforward sampling strategy problematic. In this paper, we propose a novel framework, named AutoNE, to automatically optimize the hyperparameters of a NE algorithm on massive networks. In detail, we employ a multi-start random walk strategy to sample several small sub-networks, perform each trial of configuration selection on a sampled sub-network, and design a meta-learner to transfer the knowledge about optimal hyperparameters from the sub-networks to the original massive network.
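The multi-start random walk sampler can be sketched as follows (a minimal illustration with hypothetical parameter names, not the AutoNE implementation):

```python
import random

def sample_subnetwork(adj, walk_len=10, n_starts=3, seed=0):
    """Run short random walks from several start nodes and return the
    subgraph induced on all visited nodes.

    adj: dict mapping each node to a list of its neighbours."""
    rng = random.Random(seed)
    visited = set()
    for start in rng.sample(sorted(adj), n_starts):
        cur = start
        visited.add(cur)
        for _ in range(walk_len):
            nbrs = adj[cur]
            if not nbrs:
                break                       # dead end: stop this walk
            cur = rng.choice(nbrs)
            visited.add(cur)
    # Induced subgraph: keep only edges between visited nodes.
    return {u: [v for v in adj[u] if v in visited] for u in visited}
```

Hyperparameter trials then run on such sub-networks, and the meta-learner extrapolates the best configuration to the full graph.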
The transferred meta-knowledge greatly reduces the number of trials required when predicting the optimal hyperparameters for the original network. Extensive experiments demonstrate that our framework can significantly outperform the existing methods, in that it needs less time and fewer trials to find the optimal hyperparameters.

Generalized additive models (GAMs) are favored in many regression and binary classification problems because they are able to fit complex, nonlinear functions while still remaining interpretable.
In the first part of this paper, we generalize a state-of-the-art GAM learning algorithm based on boosted trees to the multiclass setting, showing that this multiclass algorithm outperforms existing GAM learning algorithms and sometimes matches the performance of full complexity models such as gradient boosted trees. In the second part, we turn our attention to the interpretability of GAMs in the multiclass setting. Surprisingly, the natural interpretability of GAMs breaks down when there are more than two classes. Naive interpretation of multiclass GAMs can lead to false conclusions.
Inspired by binary GAMs, we identify two axioms that any additive model must satisfy in order to not be visually misleading. We then develop a technique called Additive Post-Processing for Interpretability (API) that provably transforms a pretrained additive model to satisfy the interpretability axioms without sacrificing accuracy.
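The key fact that makes such post-processing possible can be checked in a few lines: in an additive model, shifting every class's contribution for a given feature by the same per-feature offset leaves the softmax predictions unchanged, so per-feature contributions can be centred across classes. The toy model below is illustrative, not the paper's exact API transform:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
contribs = rng.normal(size=(4, 3))   # contribs[j, k]: feature j, class k

pred = softmax(contribs.sum(0))      # predictions before post-processing

# Centre each feature's contributions to mean zero across classes.
# The subtracted offset is identical for every class, so all logits
# shift by a common constant and the softmax output is unchanged.
centred = contribs - contribs.mean(axis=1, keepdims=True)
pred_centred = softmax(centred.sum(0))
```

After centring, a feature's per-class contribution is read relative to a common zero baseline, which avoids the misleading plots the axioms rule out.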
The technique works not just on models trained with our learning algorithm, but on any multiclass additive model, including multiclass linear and logistic regression. We demonstrate the effectiveness of API on a multiclass infant mortality dataset.

An effective content recommendation in modern social media platforms should benefit both creators, by bringing genuine benefits to them, and consumers, by helping them find genuinely interesting content.
SEAN uses a personalized content recommendation model to encourage personal interests driven recommendation. Moreover, SEAN allows the personalization factors to attend to users' higher-order friends on the social network to improve the accuracy and diversity of recommendation results. Constructing two datasets from a popular decentralized content distribution platform, Steemit, we compare SEAN with state-of-the-art CF and content based recommendation approaches.
Experimental results demonstrate the effectiveness of SEAN in terms of both Gini coefficients for recommendation equality and F1 scores for recommendation performance.

Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes.
We consider the case of binary node attributes. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only a minimal effect on the predictive accuracy.
Graph convolutional networks (GCNs) have been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer from either a high computational cost that grows exponentially with the number of GCN layers, or a large space requirement for keeping the entire graph and the embedding of each node in memory. Cluster-GCN works as follows: at each step, it samples a block of nodes that associate with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search to this subgraph.
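A minimal sketch of one such block-restricted propagation step (illustrative shapes and normalisation, not the released Cluster-GCN code):

```python
import numpy as np

def cluster_gcn_layer(A, X, W, blocks):
    """Apply one GCN layer block by block: within each precomputed node
    block, propagate features using ONLY the within-block adjacency, so
    memory scales with the block size rather than the whole graph."""
    H = np.zeros((A.shape[0], W.shape[1]))
    for block in blocks:
        idx = np.asarray(block)
        A_b = A[np.ix_(idx, idx)] + np.eye(len(idx))   # self-loops
        A_hat = A_b / A_b.sum(1, keepdims=True)        # row-normalise
        H[idx] = np.maximum(A_hat @ X[idx] @ W, 0)     # propagate + ReLU
    return H
```

Edges that cross block boundaries are simply dropped in this sketch; the paper mitigates this loss by stochastically combining multiple clusters per batch.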
This simple but effective strategy leads to significantly improved memory and computational efficiency while achieving test accuracy comparable with previous algorithms. To test the scalability of our algorithm, we create a new Amazon2M dataset with 2 million nodes and 61 million edges, which is more than 5 times larger than the previous largest publicly available dataset (Reddit). For training a 4-layer GCN on this data, our algorithm can finish in around 36 minutes, while all the existing GCN training algorithms fail to train due to out-of-memory issues. Furthermore, Cluster-GCN allows us to train much deeper GCNs without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score.

In this paper we consider clustering problems in which each point is endowed with a color.
The goal is to cluster the points to minimize the classical clustering cost but with the additional constraint that no color is over-represented in any cluster. This problem is motivated by practical clustering settings, e. For the most general version of this problem, we obtain an algorithm that has provable guarantees of performance; our algorithm is based on finding a fractional solution using a linear program and rounding the solution subsequently.
For the special case of the problem where no color has an absolute majority in any cluster, we obtain a simpler combinatorial algorithm, also with provable guarantees. Experiments on real-world data show that our algorithms are effective in finding good clusterings without over-representation.

Unlike the standard convolutional neural network, graph convolutional neural networks perform the convolution operation on graph data.
Compared with the generic data, the graph data possess the similarity information between different nodes. Thus, it is important to preserve this kind of similarity information in the hidden layers of graph convolutional neural networks. However, existing works fail to do that. On the other hand, it is challenging to enforce the hidden layers to preserve the similarity relationship. To address this issue, we propose a novel CRF layer for graph convolutional neural networks to encourage similar nodes to have similar hidden features.
In this way, the similarity information can be preserved explicitly.
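The update such a layer performs can be sketched as a mean-field-style smoothing step (the shapes and the exact update rule here are illustrative assumptions, not the paper's layer):

```python
import numpy as np

def crf_smooth(H, S, alpha=1.0, iters=5):
    """Pull each node's hidden feature toward a similarity-weighted
    average of its neighbours while staying anchored to the input.

    H: (n, d) hidden features; S: (n, n) nonnegative similarities."""
    B = H.copy()
    for _ in range(iters):
        # Unary term keeps B close to H; pairwise term averages neighbours.
        B = (H + alpha * S @ B) / (1.0 + alpha * S.sum(1, keepdims=True))
    return B
```

Similar nodes end up with more similar hidden features, while nodes with zero similarity to everyone are left untouched.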
In addition, the proposed CRF layer is easy to compute and optimize. Therefore, it can be easily inserted into existing graph convolutional neural networks to improve their performance. At last, extensive experimental results have verified the effectiveness of our proposed CRF layer.

Modern search engines increasingly incorporate tabular content, which consists of a set of entities each augmented with a small set of facts. The facts can be obtained from multiple sources: an entity's knowledge base entry, the infobox on its Wikipedia page, or its row within a WebTable.
Crucially, the informativeness of a fact depends not only on the entity but also on the specific context. To the best of our knowledge, this paper is the first to study the problem of contextual fact ranking: given some entities and a context, rank the entities' facts by how informative they are in that context. We propose to contextually rank the facts by exploiting deep learning techniques.
In particular, we develop pointwise and pairwise ranking models, using textual and statistical information for the given entities and context derived from their sources. We enhance the models by incorporating entity type information from an IsA (hypernym) database. We further conduct user studies for two specific applications of contextual fact ranking, table synthesis and table compression, and show that our models can identify more informative facts than the baselines.

Concepts are often described in terms of positive integer-valued attributes that are organized in a hierarchy. For example, cities can be described in terms of how many places there are of various types.
This hierarchy imposes particular constraints on the values of related attributes. Moreover, knowing that a city has many food venues makes it less surprising that it also has many Portuguese restaurants, and vice versa. In the present paper, we attempt to characterize such concepts in terms of so-called contrastive antichains: particular kinds of subsets of their attributes and their values. We address the question of when a contrastive antichain is interesting, in the sense that it concisely describes the unique aspects of the concept, while duly taking into account the known attribute dependencies implied by the hierarchy.
Our approach is capable of accounting for previously identified contrastive antichains, making iterative mining possible. Besides the interestingness measure, we also present an algorithm that scales well in practice, and we demonstrate the usefulness of the method in an extensive empirical results section.

Taxis and shared bikes bring great convenience to urban transportation. A lot of effort has been made to improve the efficiency of taxi services or bike sharing systems by predicting the next-period pick-up or drop-off demand. Different from the existing research, this paper is motivated by the following two facts: (1) from a micro view, an observed spatial demand at any time slot can be decomposed as a combination of many hidden spatial demand bases; (2) from a macro view, the multiple transportation demands are strongly correlated with each other, both spatially and temporally.
Clearly, the above two views have great potential to improve upon the existing taxi or bike demand prediction methods. In particular, a deep convolutional neural network is constructed to decompose a spatial demand into a combination of hidden spatial demand bases. The combination weight vector is used as a representation of the decomposed spatial demand.
Then, a heterogeneous Long Short-Term Memory (LSTM) network is proposed to integrate the states of the multiple transportation demands and to model their dynamics jointly. Last, environmental features such as humidity and temperature are combined with the resulting overall hidden states to predict the multiple demands simultaneously.
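The micro view can be illustrated with a toy decomposition (purely illustrative; the paper uses a deep convolutional network rather than this least-squares fit): approximate an observed spatial demand map as a nonnegative combination of hidden demand bases, and take the weight vector as its representation:

```python
import numpy as np

def decompose_demand(demand, bases, iters=300, lr=0.05):
    """Fit nonnegative weights w with demand ~= sum_k w[k] * bases[k];
    w is then a compact representation of the demand map."""
    K = bases.shape[0]
    B = bases.reshape(K, -1)            # flatten each basis map
    d = demand.ravel()
    w = np.full(K, 1.0 / K)
    for _ in range(iters):
        grad = B @ (B.T @ w - d)        # least-squares gradient
        w = np.maximum(w - lr * grad, 0.0)
    return w
```

When the demand map is an exact combination of the bases, the recovered weights match the generating coefficients.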
Experiments have been conducted on real-world taxi and bike-sharing demand data; the results demonstrate the superiority of the proposed method over both classical and state-of-the-art transportation demand prediction methods.

Coresets are important tools for generating concise summaries of massive datasets for approximate analysis. A coreset is a small subset of points extracted from the original point set such that certain geometric properties are preserved with provable guarantees. This paper investigates the problem of maintaining a coreset to preserve the minimum enclosing ball (MEB) for a sliding window of points that are continuously updated in a data stream.
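For context, the batch MEB problem itself admits a very simple (1 + eps)-approximation, the Badoiu-Clarkson scheme, sketched below; the paper's sliding-window algorithm is more involved, since old points must also expire:

```python
import numpy as np

def meb_approx(P, eps=0.05):
    """Badoiu-Clarkson: step the centre toward the current farthest
    point for O(1/eps^2) rounds; the resulting ball has radius at most
    (1 + eps) times the optimal MEB radius."""
    c = P[0].astype(float)
    for t in range(1, int(np.ceil(1.0 / eps ** 2)) + 1):
        far = P[np.argmax(((P - c) ** 2).sum(1))]   # farthest point
        c = c + (far - c) / (t + 1)                 # shrinking step
    r = np.sqrt(((P - c) ** 2).sum(1).max())
    return c, r
```

The points touched as "farthest" along the way form a small coreset whose MEB approximates that of the full set.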
Although the problem has been extensively studied in batch and append-only streaming settings, no efficient sliding-window solution is available yet. Our AOMEB algorithm improves the practical performance of the state-of-the-art algorithm while having the same approximation ratio.

Low-rank tensor factorization has been widely used for many real-world tensor completion problems.
While most existing factorization models assume a multilinear relationship between tensor entries and their corresponding factors, real-world tensors tend to have more complex interactions than multilinearity. In many recent works, it is observed that multilinear models perform worse than nonlinear models. We identify one potential reason for this inferior performance: the nonlinearity inside the data obfuscates the underlying low-rank structure, such that the tensor appears to be a high-rank tensor. Solving this problem requires a model that simultaneously captures the complex interactions and preserves the low-rank structure.
In addition, the model should be scalable and robust to missing observations in order to learn from large yet sparse real-world tensors. Our CoSTCo model leverages the expressive power of CNNs to model the complex interactions inside tensors, and its parameter sharing scheme to preserve the desired low-rank structure. CoSTCo is scalable, as it does not involve computation- or memory-heavy operations such as the Kronecker product. We conduct extensive experiments on several real-world large sparse tensors, and the experimental results show that our model clearly outperforms both linear and nonlinear state-of-the-art tensor completion methods.
We focus on the problem of streaming recommender systems and explore novel collaborative filtering algorithms to handle data dynamicity and complexity in a streaming manner. Although deep neural networks have demonstrated their effectiveness on recommendation tasks, there is a lack of exploration of integrating probabilistic models and deep architectures under streaming recommendation settings.
Conjoining the complementary advantages of probabilistic models and deep neural networks could enhance both model effectiveness and the understanding of inference uncertainties. The framework jointly combines stochastic processes and deep factorization models under a Bayesian paradigm to model the generation and evolution of users' preferences and items' popularities. To ensure efficient optimization and streaming update, we further propose a sequential variational inference algorithm based on a cross variational recurrent neural network structure.
Experimental results on three benchmark datasets demonstrate that the proposed framework performs favorably against the state-of-the-art methods in terms of both temporal dependency modeling and predictive accuracy. The learned latent variables also provide visualized interpretations for the evolution of temporal dynamics.
Despite the great success of many matrix factorization based collaborative filtering approaches, there is still much room for improvement in the recommender system field. One main obstacle is the cold-start and data sparseness problem, which requires better solutions. Recent studies have attempted to integrate review information into rating prediction.