CH Automated Metadata Extractor - Public Room Timeline

	CH Automated Metadata Extractor	4 Members
	Discussion regarding the CH Automated Metadata Extractor's (chext) development.	2 Servers

Sender	Message	Time
1 Sep 2019
high_octane	Is the solution to this problem fundamentally different from the solution to individually identifying the actresses? Right now I'm thinking they can be solved in similar ways, which would be to train a model to identify them. Not all CHs are made the same, however, and will probably require an alternative solution to be properly identified.	17:54:39
high_octane	Also, if you have suggestions for other valuable libraries that would make things easier to implement, please let me know their names and what their specific function/role would be. Currently, only OpenCV is used, and I'm unsure if that is enough for what we're trying to accomplish.	17:55:38
high_octane	I read that OpenCV has support for TensorFlow, PyTorch, and Caffe.	17:55:49
adjones	The solution for the beatmeter should be similar to the actress identification. Therefore, I think at least TensorFlow is required, as we have to train our own model. We should also find some pretrained ones, as we don't have a huge dataset...	19:06:45
high_octane	What exactly defines a pretrained model? I'm new to machine learning, so I'm a bit perplexed by this. I was under the assumption that we'd have to train a model ourselves in order to look for something specific, like beats on the beat meter.	19:25:11
high_octane	I just read the differences between OpenCV and TensorFlow, and it seems that they are applied to solving different problems. Perhaps I had some misconceptions about OpenCV. TensorFlow might be more suitable for this project.	21:32:50
2 Sep 2019
high_octane	Okay, how about this. We'll use OpenCV to process the CHs as video frames, and use the models that were trained using TensorFlow for identification of beats/actresses/etc (via OpenCV's support of TensorFlow models).	01:01:38
high_octane	This will decouple OpenCV and TensorFlow, which means that we can use TensorFlow via the Python API (with Keras, if you want) for the model training. You mentioned that you would prefer to use Python, and I agree. Using TensorFlow's C++ API looks nightmarish, so using Python seems like the best course of action.	01:05:53
adjones	In reply to @high_octane:matrix.org: "What exactly defines a pretrained model? I'm new to machine learning, so I'm a bit perplexed by this. I was under the assumption that we'd have to train a model ourselves in order to look for something specific, like beats on the beat meter." OK, so in machine learning it is common to take so-called pretrained models and fine-tune them for a specific task. A model normally has thousands of parameters or more, and training them requires a huge amount of labelled data. For small datasets or similar tasks, you can start from an existing model whose parameters are already "good" for your kind of problem, then train it on the small dataset until the results are good enough. E.g. there are models for face detection or face classification, but probably not for pornstars. However, we can take these existing models and train them with some pornstar images so that they learn to detect a specific person, without having to teach the network what a face looks like. For the beatmeter this may be a bit different. Nevertheless, we can try to use e.g. character detection networks or something else that looks similar to beatmeters...	16:30:46
adjones	In reply to @high_octane:matrix.org: "This will decouple OpenCV and TensorFlow, which means that we can use TensorFlow via the Python API (with Keras, if you want) for the model training. You mentioned that you would prefer to use Python, and I agree. Using TensorFlow's C++ API looks nightmarish, so using Python seems like the best course of action." If it is possible to load models generated via Python in C++, this would also be my preferred choice. Python as a language is relatively easy to learn, especially if you have already worked with other programming languages. OpenCV in general also has some machine learning features, but for complicated tasks like face detection I think TensorFlow works better. As we don't have to build models from scratch...	16:34:53
adjones	If anything is unclear, just ask and I'll try to explain it better :)	16:35:32
high_octane	Okay! So pretrained models are something you can build on top of. Thanks for the clarification, adjones! Will you be contributing to the code base as well? Of course, you're under no obligation to do so if you don't want to. You've been a great help thus far. 😃	20:11:36
3 Sep 2019
adjones	Yes, I will be contributing to the code as well. Maybe not this week, but next week I have more time. I will probably write more Python than C/C++ code, as that's what I have been working with for the last few months. I will try to pull the repository and set up some IDEs before the weekend at the latest...	06:45:11
high_octane	Okay cool! I got TensorFlow for Python, but I wanted to build it from scratch because it wasn't taking advantage of some CPU extensions (like AVX2). So I was sitting there watching it compile forever, and then it failed to build, saying that a dependency needed Visual Studio Build Tools 2017. I went to download those tools, only to find out that Microsoft no longer supports Windows 8. ☹️ Just my luck...	07:05:42
high_octane	I've been trying to understand how the mp4 spec handles variable framerate video, and I'm having some bad luck. Right now, there's a bug in the mp4 parser where the extracted fps denominator is incorrect if there are multiple frame deltas. It just so happens that 3xTripleXXX's Getting Down with the Thiccness has a variable framerate intro, for some reason.	07:12:34
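A minimal sketch of one way to get a frame rate that stays correct when a track has several frame deltas. It assumes the parser has already read the track timescale (from the `mdhd` box) and the `(sample_count, sample_delta)` pairs (from the `stts` box). The function name `average_fps` and the sample values are hypothetical and are not taken from chext's parser.

```python
from fractions import Fraction


def average_fps(timescale, stts_entries):
    """Return the average frame rate of a track as an exact fraction.

    timescale: ticks per second (assumed to come from the mdhd box).
    stts_entries: list of (sample_count, sample_delta) pairs
                  (assumed to come from the stts box).
    """
    # Count every frame across all runs, not just the first run.
    total_frames = sum(count for count, _ in stts_entries)
    # Total duration in ticks: each run contributes count * delta.
    total_ticks = sum(count * delta for count, delta in stts_entries)
    if total_ticks == 0:
        raise ValueError("track has no duration")
    # Fraction reduces numerator and denominator automatically.
    return Fraction(total_frames * timescale, total_ticks)


# Constant frame rate: a single run of identical deltas.
constant = average_fps(30000, [(300, 1000)])

# Variable frame rate: 10 slow frames, then 290 regular ones.
variable = average_fps(30000, [(10, 2000), (290, 1000)])
```

Summing over every run avoids deriving the denominator from the first delta alone, which is one plausible cause of the bug described above. For a truly variable track the result is only an average, so per-frame timestamps would still be needed for exact seeking.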
high_octane	In terms of IDEs, I used to use Eclipse and Notepad++ in the past, but I eventually began using Vim and never looked back.	07:15:26
high_octane	My favorite language to program in is C. It's such a simple yet powerful language. The only real issue with C is that development takes much longer than in other languages. I don't like to use C++ at all, especially its template feature. I'm also not a big fan of classes in any language. I have a very hard time thinking in an object-oriented mindset. I tend to prefer thinking about things in terms of data. It's too bad that so many libraries are written in C++. I like Python almost as much as C. Sometimes, some of its features confuse me a little (like list comprehensions) because I'm too used to thinking about solving problems closer to the hardware, like in C. I do love how rapid development can be with Python, as well as its syntax. A big issue I have with Python is how it didn't just treat strings as bytes from the start. That eventually led to some messiness later on.	07:37:23
adjones	Guess for me it's more the other way around. I've never worked with close-to-hardware programming. I learned the "modern" OOP concepts in C++ and also used them in Python.	19:09:54
high_octane	C was my first programming language, so I'm probably a bit biased as a result. I've never really used classes in Python, but I can give it a shot if you're more comfortable/familiar with using them.	19:14:10
high_octane	Quick note, I'm not going to be working on this for a few days because I need to make a beat meter for an up-and-coming CH creator. I'll be back after then.	19:34:09
doremi	So, this is hyerspace here. LOL!!! high_octane, Xity said that he was busy for the next month or so because of real life.	22:04:03
high_octane	What's a hyerspace? I sent those bug fixes to him like a month ago.	22:10:49
high_octane	August 4th.	22:15:43
high_octane	Xity's original solution to the problem wasn't the easiest to maintain. I suggested the use of modulus as a means to create a boundary in an array of video IDs. That way, when you increment or decrement through the indices, the modulus function would keep you in bounds.	22:25:32
high_octane	His original solution consisted of associative arrays containing the next relative video and previous relative video, for each video (if I recall correctly).	22:27:40
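The modulus idea described above can be sketched as follows. This is only an illustration: the helper name `step_index` and the video IDs are hypothetical, and nothing here is taken from Xity's code.

```python
def step_index(index, offset, count):
    """Move `offset` positions through a list of `count` items, wrapping at both ends."""
    if count <= 0:
        raise ValueError("count must be positive")
    # In Python, % with a positive divisor never returns a negative value,
    # so stepping back from index 0 lands on the last index.
    return (index + offset) % count


video_ids = ["vid_a", "vid_b", "vid_c"]  # hypothetical IDs

# Stepping forward from the last video wraps to the first.
next_id = video_ids[step_index(2, 1, len(video_ids))]

# Stepping backward from the first video wraps to the last.
prev_id = video_ids[step_index(0, -1, len(video_ids))]
```

A single ordered array plus this helper replaces the per-video next/previous maps, so adding or removing a video only changes the array. In languages where `%` can return a negative result (C and JavaScript, for example), the usual workaround is `((index + offset) % count + count) % count`.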
high_octane	I just went back to the thread and noticed that he is planning on doing a full rewrite, which I had not previously known. Hopefully, he will consider my suggestion for that specific problem.	22:30:28
4 Sep 2019
doremi	Re: hyerspace, I meant hyperspace, like being kicked out by the door bouncer. ;-)	00:41:01
high_octane	Oh! 🤣 Yes, here I am in hyperspace! I can be daft sometimes, bear with me.	01:37:53
high_octane	I totally forgot about my analogy from earlier.	01:38:47
