late post
This post should have been up last Sunday at the latest.
Last time I wrote that I was going to test whether our method works on simulator-generated data that doesn't include any noise or reverberation. I added a small step to the energy method that calculates, for each frame, the ratio of the energy above the threshold value to all the energy in the frame. If the ratio is less than 20 dB, I treat the frame as useless and do not use it to find the orientation. All said and done, I got near-perfect results. I say near perfect because I found two "minor" problems.
1) When we are close to one set of microphones and facing directly away from them (that is, the microphones we are aiming at are considerably distant), I got a bias of almost 10 degrees.
2) In files that worked really well, there was usually one frame that would go wrong, giving an error of 10-20 degrees, but only in that one frame. It was usually the same frame in each simulated data set (remember, I use the same clean speech file), so I think it is caused by some special case in the speech.
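To make the frame check described above concrete, here is a minimal sketch in Python. It is not the code I actually ran. The names frame_is_usable, bin_energies and min_ratio_db are made up for illustration, and so is the exact form of the ratio: a fraction of the total energy can never exceed 0 dB, so this sketch assumes the energy above the threshold is compared against the energy that stays at or below it.

```python
import math


def frame_is_usable(bin_energies, threshold, min_ratio_db=20.0):
    """Decide whether a frame carries enough above-threshold energy.

    bin_energies: non-negative energy values for one frame (hypothetical input).
    threshold:    energy level separating "above" from "below" (hypothetical).
    min_ratio_db: frames whose ratio falls under this value are rejected.
    """
    above = sum(e for e in bin_energies if e > threshold)
    below = sum(e for e in bin_energies if e <= threshold)

    if above <= 0.0:
        return False  # nothing exceeds the threshold, so the frame is useless
    if below <= 0.0:
        return True   # every value is above the threshold

    # ASSUMPTION: ratio of above-threshold energy to the remaining energy.
    ratio_db = 10.0 * math.log10(above / below)
    return ratio_db >= min_ratio_db
```

Frames that fail the check are simply left out of the orientation estimate.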
So after these results, I started thinking about what to do next and came up with the following ideas:
1) check to see what is causing the bias I described.
2) check to see what is causing the problem described in 2.
3) try to set up the discriminator without the clean speech
4) try to come up with ways to deal with the reverberant energies
5) try to come up with ways to use the energy method without position data.
After discussing these with Prof. Silverman, I decided to attack the first problem. My first guess was that it was caused by the simulator. The orientation data that the simulator uses suggests that the very high frequencies aren't right at the front but slightly to the left or to the right, so I thought that at longer distances this had caused the bias. I was dramatically wrong.
In the original energy method, after interpolating the energies of the microphones to angles, a front-to-back ratio is taken. I found out that when we are close to the microphone we are aiming at, we get the wrong answer if we do not take this ratio; however, when we are far away from the microphone we are aiming at, we get the wrong answer if we do take it. The reason has to do with the heights of the microphones. I have set up the problem and I'll explain it in my next post. It's complicated, and it's not something to do on the computer; it has to be worked out with pen and paper.
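For readers who haven't seen the front-to-back step, here is a rough sketch of the idea in Python. It is only an illustration under assumptions: the real method's interpolation scheme and window sizes are not given here, so this uses linear interpolation around the circle and a hypothetical window of plus or minus 60 degrees. The function names and the use_ratio flag are made up too. It treats the problem as flat, ignoring microphone heights and distances, which is exactly where the trouble described above comes from.

```python
def interpolate_energy(mic_angles_deg, mic_energies, step_deg=1.0):
    """Linearly interpolate microphone energies onto a uniform circular grid.

    mic_angles_deg: bearing of each microphone as seen from the talker
                    (ASSUMPTION: a flat 2-D layout, heights ignored).
    """
    pairs = sorted(zip((a % 360.0 for a in mic_angles_deg), mic_energies))
    angles = [p[0] for p in pairs]
    energies = [p[1] for p in pairs]
    # Repeat the first microphone one full turn later to close the circle.
    angles.append(angles[0] + 360.0)
    energies.append(energies[0])

    grid = []
    for k in range(int(round(360.0 / step_deg))):
        theta = k * step_deg
        t = theta if theta >= angles[0] else theta + 360.0
        for i in range(len(angles) - 1):
            if angles[i] <= t <= angles[i + 1]:
                span = angles[i + 1] - angles[i]
                w = 0.0 if span == 0.0 else (t - angles[i]) / span
                grid.append(energies[i] * (1.0 - w) + energies[i + 1] * w)
                break
    return grid


def estimate_orientation(grid, step_deg=1.0, half_window_deg=60.0, use_ratio=True):
    """Return the angle whose front window scores highest.

    With use_ratio=True the score is front energy divided by back energy;
    with use_ratio=False it is the front energy alone.
    """
    n = len(grid)
    half = int(round(half_window_deg / step_deg))  # must stay below 90 degrees
    offsets = range(-half, half + 1)
    best_angle, best_score = 0.0, float("-inf")
    for k in range(n):
        front = sum(grid[(k + j) % n] for j in offsets)
        if use_ratio:
            back = sum(grid[(k + n // 2 + j) % n] for j in offsets)
            score = front / back if back > 0.0 else float("inf")
        else:
            score = front
        if score > best_score:
            best_angle, best_score = k * step_deg, score
    return best_angle
```

Switching use_ratio on and off on the same grid is a quick way to compare the two variants near and far from the microphones.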
coming soon...!



