We created tasks on Amazon Mechanical Turk that allow workers to perform human intelligence tasks (HITs). Typically, these tasks involve building datasets used by machine learning algorithms. The task layouts are designed so that they can be tailored to the client's wishes.
We assisted with data collection during a lead generation campaign. This involved a meeting with the client, followed by the drafting of a document so that the client could collect their data in a form that allowed subcontractors to make the best use of it in direct marketing campaigns (mailings, forms, etc.).
We implemented a generative adversarial network (GAN), which consists of two neural networks competing against each other. This technique was used to generate entirely new cartoon images from a training database.
In addition, we explored improvements to GANs, such as the so-called Wasserstein method (WGAN), which is intended to improve the quality of the generated images.
In this context, we investigated whether such added complexity was necessary for cartoon images, which are relatively low-complexity (as opposed to HD photographs, for example). We showed that it was not necessary for this particular data type.
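To make the difference between the two approaches concrete, here is a minimal sketch (not our production code) contrasting the standard GAN discriminator loss with the Wasserstein critic loss; the function names and toy inputs are purely illustrative:

```python
import numpy as np

def gan_discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss: binary cross-entropy on
    sigmoid outputs, with label 1 for real and 0 for fake samples."""
    eps = 1e-12  # avoid log(0)
    return float(-np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps)))

def wasserstein_critic_loss(c_real, c_fake):
    """WGAN critic loss: the critic outputs unbounded scores and
    maximizes the real/fake score gap, so we minimize its negation."""
    return float(np.mean(c_fake) - np.mean(c_real))
```

In the Wasserstein variant the discriminator becomes a "critic": instead of classifying samples, it estimates a score gap that approximates the Wasserstein distance between the real and generated distributions.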
Principal component analysis (PCA) is a technique often used to represent data in a lower-dimensional space (2D or 3D). It can also be used to identify outliers (e.g., fraudsters). However, it cannot capture non-linear relationships. Kernel PCA addresses this by implicitly mapping the data into a higher-dimensional feature space in which they become linearly separable.
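As an illustration, kernel PCA with an RBF kernel can be sketched in a few lines of NumPy; the concentric-ring data (a classic case where plain PCA fails) and the `gamma` value are illustrative, not taken from the project:

```python
import numpy as np

rng = np.random.default_rng(0)

# two concentric rings: not linearly separable in the original 2D space
theta = rng.uniform(0, 2 * np.pi, 200)
r = np.repeat([1.0, 3.0], 100)
X = np.c_[r * np.cos(theta), r * np.sin(theta)]

def rbf_kernel_pca(X, gamma=1.0, n_components=2):
    # RBF kernel matrix from pairwise squared distances
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    # center the kernel matrix in feature space
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    # top eigenvectors, scaled by sqrt(eigenvalue), give the projections
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

Z = rbf_kernel_pca(X, gamma=0.5)
```

The key point is that the higher-dimensional feature space is never built explicitly: only the kernel matrix of pairwise similarities is needed (the "kernel trick").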
It is a very powerful technique, and kernels are also used in other algorithms (e.g., support vector machines). In addition, we used random Fourier features to speed up the algorithm.
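A minimal sketch of the random-Fourier-feature speed-up (all parameter values here are illustrative): the random map `z(x) = sqrt(2/D) * cos(Wx + b)` makes the explicit inner product `z(x) · z(y)` approximate the RBF kernel value, so the expensive n×n kernel matrix can be replaced by an ordinary linear method on `z(X)`:

```python
import numpy as np

def random_fourier_features(X, gamma=1.0, n_features=2000, seed=0):
    """Approximate the RBF kernel exp(-gamma * ||x - y||^2) via
    Bochner's theorem: sample frequencies W ~ N(0, 2*gamma*I)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(50, 3))
Z = random_fourier_features(X, gamma=0.5)
K_approx = Z @ Z.T

# exact RBF kernel for comparison
sq = np.sum(X**2, axis=1)
K_exact = np.exp(-0.5 * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
```

The approximation error shrinks roughly as O(1/sqrt(D)) in the number of random features D, which is what makes the trade-off between speed and accuracy tunable.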
Nowadays, data privacy is an important and sensitive issue in the business world. There are, however, ways to protect data owners from attacks or from misuse of data that is often stored on servers. On a public data set, we applied (pseudo)anonymization, hashing the direct identifiers with a hash function.
Moreover, the data was modified so that individuals can no longer be identified through quasi-identifiers. This is particularly useful when sensitive attributes are involved (e.g., diseases, creditworthiness, etc.). Since such modifications can make the data less useful, the loss was quantified with a utility measure (defined here as the change in entropy).
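A toy sketch of this approach (the salt, age bands, and sample values are hypothetical, not from the actual data set): direct identifiers are hashed, a quasi-identifier is generalized into bands, and the utility loss is measured as the change in Shannon entropy:

```python
import hashlib
import math
from collections import Counter

def pseudonymize(value, salt="illustrative-salt"):
    # hash a direct identifier (name, email, ...) with a keyed salt
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

def generalize_age(age):
    # coarsen a quasi-identifier: exact age -> 10-year band
    lo = (age // 10) * 10
    return f"{lo}-{lo + 9}"

def entropy(values):
    # Shannon entropy in bits of the empirical distribution
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

ages = [23, 27, 31, 35, 35, 42, 58, 61]      # hypothetical records
bands = [generalize_age(a) for a in ages]
loss = entropy(ages) - entropy(bands)         # utility lost by generalizing
```

The entropy drop gives a single number to compare candidate generalizations: the coarser the bands, the harder re-identification becomes, but the larger the utility loss.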
Network analysis is used to study the relationships between individuals. We used these techniques to build a network analysis of a movie's characters from its script. Via a matrix representation of co-occurrence graphs, it was possible to deduce the strength of the relationships between them.
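Building the co-occurrence weights can be sketched as follows; the scenes and character names are made up for illustration, and the edge weight between two characters is simply the number of scenes they share:

```python
from itertools import combinations
from collections import Counter

# hypothetical script: each scene lists the characters appearing together
scenes = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Bob", "Carol"],
    ["Alice", "Dave"],
]

# co-occurrence counts: edge weight = number of shared scenes
weights = Counter()
for scene in scenes:
    for a, b in combinations(sorted(set(scene)), 2):
        weights[(a, b)] += 1
```

These weighted pairs are exactly the non-zero entries of the (symmetric) co-occurrence matrix mentioned above, with one row and column per character.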
Then, we used the Louvain algorithm to highlight different communities with a relatively low time complexity. Moreover, we implemented a greedy algorithm to optimize the propagation of information through the network.
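The greedy step can be sketched as follows, assuming (our assumption, not stated above) an independent cascade model of information spread with Monte Carlo estimation of the expected spread; the graph and parameters are illustrative:

```python
import random

def simulate_cascade(graph, seeds, p=0.2, rng=None):
    """Independent cascade: each newly activated node gets one chance
    to activate each inactive neighbour with probability p."""
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_seeds(graph, k, trials=200, p=0.2):
    """Greedily add the node with the largest estimated expected spread."""
    seeds = []
    for _ in range(k):
        best, best_gain = None, -1.0
        for v in graph:
            if v in seeds:
                continue
            rng = random.Random(42)  # same stream for a fair comparison
            gain = sum(simulate_cascade(graph, seeds + [v], p, rng)
                       for _ in range(trials)) / trials
            if gain > best_gain:
                best, best_gain = v, gain
        seeds.append(best)
    return seeds

# toy star network: the hub is the obvious best single seed
graph = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
```

Greedy selection is the standard baseline here because the expected spread is submodular, which gives the greedy solution a (1 - 1/e) approximation guarantee.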
Copyright © 2022 Troople - All rights reserved.