Recommending new products using k-means clusters?
I’m trying to figure out the best way to recommend images based on past classifications using k-means clusters. What I have done is: mapped the RGB values of a set of images, performed a k-means cluster analysis on those RGB values, and attached a “rating” to each image. This has created Voronoi cells similar to this graph. I’ve stored the cluster centers and ratings as my “training set”.
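For reference, a minimal sketch of roughly what I did, assuming scikit-learn; `rgb_values` and `ratings` below are just placeholders for my actual data:

```python
# Sketch of the setup described above, assuming scikit-learn.
# `rgb_values` and `ratings` stand in for real per-image data.
import numpy as np
from sklearn.cluster import KMeans

rgb_values = np.random.rand(100, 3)          # placeholder: one mean-RGB triple per image
ratings = np.random.randint(1, 6, size=100)  # placeholder: one rating per image (1-5)

kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(rgb_values)

# Attach an average rating to each cluster center (the "training set").
cluster_ratings = np.array([
    ratings[kmeans.labels_ == k].mean() for k in range(kmeans.n_clusters)
])

# A new image falls into the Voronoi cell of its nearest center
# and inherits that cluster's rating.
new_image_rgb = np.array([[0.2, 0.5, 0.7]])
nearest_cluster = kmeans.predict(new_image_rgb)[0]
predicted_rating = cluster_ratings[nearest_cluster]
```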
How to access large amounts of data for machine learning in a microservices architecture
Imagine you are building a product recommendation algorithm for an e-commerce application that is built as a microservices architecture, with separate services for users and products. The algorithm should be exposed as a recommendation service which, given a user id, returns a list of recommended products based on that user’s buying history.
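As a minimal sketch of the naive approach (the service URLs, endpoints and JSON shapes are hypothetical), the recommendation service could simply call the other services over HTTP for each request:

```python
# Hypothetical sketch: the recommendation service gathers its inputs by
# calling the user and product services over HTTP per request.
import requests

USER_SERVICE = "http://user-service/api"      # hypothetical base URLs
PRODUCT_SERVICE = "http://product-service/api"

def recommend(user_id: str, limit: int = 10) -> list[dict]:
    # Fetch the user's buying history from the user service.
    history = requests.get(f"{USER_SERVICE}/users/{user_id}/orders", timeout=5).json()
    bought_ids = {item["product_id"] for order in history for item in order["items"]}

    # Fetch candidate products from the product service and drop items the
    # user already owns; a real system would score candidates with the
    # trained model instead of returning them unranked.
    candidates = requests.get(f"{PRODUCT_SERVICE}/products", timeout=5).json()
    return [p for p in candidates if p["id"] not in bought_ids][:limit]
```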
Machine Learning – How to incorporate data cleansing into trained model
I have been trying to search for an answer to this question but haven’t found anything that even remotely touches on it.
My question is: if I cleanse the data and impute the median value into NaN values, am I supposed to somehow incorporate this into the model that will be used on the test data? In other words, doesn’t my test data need to be cleansed and imputed as well, or will the training take care of this? I want to say it needs to be incorporated, because otherwise the NaNs break the model.
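For concreteness, here is a minimal sketch of what I think is meant, assuming scikit-learn; the toy arrays just stand in for my real train/test split:

```python
# Sketch: putting the median imputation inside a Pipeline means the medians
# are learned from the training data and re-applied, unchanged, to the test data.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression

# Toy data with NaNs, standing in for the real train/test split.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4))
X_train[rng.random(X_train.shape) < 0.1] = np.nan
y_train = rng.integers(0, 2, size=100)
X_test = rng.normal(size=(20, 4))
X_test[rng.random(X_test.shape) < 0.1] = np.nan

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # medians fitted on training data only
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)           # imputer is fitted here, on the training data
predictions = model.predict(X_test)   # the same training medians fill the test NaNs
```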
Thanks!
Is the cross-entropy cost value actually used to update the weights of a neural net, or is it just used to show how well the network is performing?
I created a neural network with 10 outputs that were then activated by the leaky ReLU function. The cost function I used for this network was the mean squared error multiplied by 0.5 to make the computation of the derivative simple: 0.5 * (expected output - network output)^2, whose derivative with respect to the network output is (network output - expected output). Please note the network I created trains on one-hot encoded label values, so the 10 outputs map to 10 classes, and each output should be 1 if its class is identified and 0 if not.
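To make the setup concrete, here is a small NumPy check of the cost and its derivative on a hypothetical 10-unit output (the numbers are made up):

```python
# Numeric check of the cost described above: the scalar cost is what gets
# reported, while its gradient is what backpropagation actually uses.
import numpy as np

network_output = np.array([0.1, 0.9, 0.2, 0.0, 0.0, 0.1, 0.0, 0.3, 0.0, 0.1])
expected = np.zeros(10)
expected[1] = 1.0  # one-hot label for the second class

cost = 0.5 * np.sum((expected - network_output) ** 2)
grad = network_output - expected   # d(cost)/d(output), fed back through the net

print(cost)   # scalar used to monitor how well the network is doing
print(grad)   # vector used (via backprop) to update the weights
```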
Bagging vs pasting [closed]
“Shortest program that fits training data is the best possible generalisation” what is this theorem?
In an MIT class, the lecturer says that the “shortest program that fits the training data is the best possible generalisation”, and that it’s not just philosophy (https://en.wikipedia.org/wiki/Occam’s_razor) but actually a mathematical fact backed by a formal (simple) proof.
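As far as I can tell, the claim is usually formalised via the minimum description length (MDL) principle; roughly (my paraphrase, not the lecture’s exact statement):

```latex
% MDL / two-part code: prefer the hypothesis H* that minimises the length of
% the hypothesis plus the length of the data encoded given that hypothesis.
H^{*} = \arg\min_{H}\, \bigl[\, L(H) + L(D \mid H) \,\bigr]
```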
Does the learning curve suggest overfitting or an acceptable level of model performance?
Machine Learning classification algorithms for multi-level data
I’m working on a machine learning project, and my dataset contains variables about social, demographic and economic aspects of 218 countries, covering the years 1960 to 2022. The target variable is binary (Yes or No) and represents whether the country had at least one coup d’état attempt in a given year.
My question is: what are the best classification models for multi-level data?
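For concreteness, a minimal sketch of the country-year panel structure and a group-aware validation split, assuming scikit-learn and pandas (the column names, model choice and toy numbers are placeholders, not a recommendation):

```python
# Sketch of group-aware validation for country-year panel data.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# Toy stand-in for the real panel: one row per country-year.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "country": np.repeat([f"C{i}" for i in range(20)], 10),
    "gdp_growth": rng.normal(size=200),
    "urban_pop_share": rng.uniform(0, 1, size=200),
    "coup": rng.integers(0, 2, size=200),
})

X = df[["gdp_growth", "urban_pop_share"]]
y = df["coup"]

# Keep all years of a country in the same fold, so the model is scored on
# countries it has never seen rather than on unseen years of familiar ones.
cv = GroupKFold(n_splits=5)
scores = cross_val_score(GradientBoostingClassifier(), X, y,
                         cv=cv, groups=df["country"], scoring="roc_auc")
print(scores.mean())
```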
AI Language that doesn’t release source code or can’t be decompiled
I looked at using Python for the AI in a security application for a hobby business, but apparently you either release it as source code or it can be easily decompiled. Of the newer machine learning languages/packages, which can be kept confidential?