I am currently busy with a sensory substitution project. It is an extension of a previous project, and I don't have the time to convert it to Python 3, so I have continued working in Python 2.7 on Ubuntu 16.04 with ROS Kinetic. It is a visual-to-audio system; the previous system took color and converted it to sound. I am getting input from a depth camera, so I have a color stream and a depth stream and use a combination of the two to produce sound. I have essentially just edited the previous code to detect people and play a sound when a person is detected. It does play a sound, but the sound always comes from the centre, so a user of the system cannot tell where the person is.
This is the format of the images passed to the sound generator: the top is the normal image in which a human has been detected, and the bottom is the depth image. I have converted both images to a 5×10 array of pixels. When a human is recognized, a black pixel is placed at their torso.
[image: downsampled color image (top) and depth image (bottom)]
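For context, the marking step is roughly equivalent to the sketch below (simplified; the actual person detector is omitted, and encode_color_image / torso_xy are hypothetical stand-ins for what my detection code does):

import cv2
import numpy as np

GRID_WIDTH = 10
GRID_HEIGHT = 5

def encode_color_image(color_image, torso_xy=None):
    # Downsample the full-resolution camera image to the 5x10 grid
    small = cv2.resize(color_image, (GRID_WIDTH, GRID_HEIGHT),
                       interpolation=cv2.INTER_AREA)
    if torso_xy is not None:
        # torso_xy is the (x, y) pixel of the detected torso in the full image;
        # map it into grid coordinates and mark that cell black
        grid_x = int(torso_xy[0] * GRID_WIDTH / color_image.shape[1])
        grid_y = int(torso_xy[1] * GRID_HEIGHT / color_image.shape[0])
        small[grid_y, grid_x] = [0, 0, 0]
    return small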
This is the code that deals with the sound generation:
def sound_generator_algorithm(retinal_encoded_image_cv2_format):
    global color_image_cv2_format
    if color_image_cv2_format is None:
        rospy.logwarn("Color image format is None, skipping this frame.")
        return

    retinal_encoded_image_width = len(retinal_encoded_image_cv2_format[0])
    retinal_encoded_image_height = len(retinal_encoded_image_cv2_format)

    if not is_setup:
        setup(retinal_encoded_image_cv2_format)

    for row in range(retinal_encoded_image_height):
        for column in range(retinal_encoded_image_width):
            # Obtaining depth value
            depth = retinal_encoded_image_cv2_format[row][column]
            # Obtaining color value
            color_value = color_image_cv2_format[row][column]

            # Find the correct color key
            color_key = None
            for color in colors:
                if (colors[color] == color_value).all():
                    color_key = color
                    break

            # If no color key is found, continue to the next pixel
            if color_key is None:
                continue

            # Muting all other colored sounds (except current)
            for color in sound_sources[row][column]:
                if color != color_key:
                    sound_sources[row][column][color].gain = 0.0

            if np.isnan(depth) or (depth == 0.0) or (color_key != 'black'):
                sound_sources[row][column][color_key].gain = 0.0
            else:
                if color_key == 'black':
                    sound_sources[row][column][color_key].gain = gain_scaled * 2
                else:
                    sound_sources[row][column][color_key].gain = gain_scaled / 2

                # Update pitch based on row
                if row == 0:
                    sound_sources[row][column][color_key].pitch = 1.7
                elif row == 1:
                    sound_sources[row][column][color_key].pitch = 1.3
                elif row == 2:
                    sound_sources[row][column][color_key].pitch = 1.0
                elif row == 3:
                    sound_sources[row][column][color_key].pitch = 0.7
                elif row == 4:
                    sound_sources[row][column][color_key].pitch = 0.3

                projected_min_depth = ssf_core.projected_pixel(unit_vector_map,
                                                               column,
                                                               row,
                                                               depth_camera_min_depth)[2]
                x_scale = 4
                y_scale = 1.0
                z_scale = 1.3
                z_power_scale = 2.0
                depth = (projected_min_depth * z_scale) + (
                    ((depth - projected_min_depth) * z_scale) ** (z_power_scale * 1.0))
                projected_pixel = ssf_core.projected_pixel(unit_vector_map,
                                                           column,
                                                           row,
                                                           depth)
                # Update the sound source's position based on the projected pixel
                sound_sources[row][column][color_key].position = [projected_pixel[0] * x_scale,
                                                                  projected_pixel[1],
                                                                  -projected_pixel[2]]
    soundsink.update()
The setup method just places a sound source on each pixel for each available sound; in this case there are only two, 'black' and 'background', and 'background' is always muted.
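For reference, setup is roughly along these lines (a simplified sketch, not the exact code; the WAV file names are placeholders, and the PyAL loader call is my assumption of how the sounds get loaded):

from openal.audio import SoundSink, SoundSource
from openal.loaders import load_wav_file

def setup(retinal_encoded_image):
    global is_setup, sound_sources, soundsink
    height = len(retinal_encoded_image)
    width = len(retinal_encoded_image[0])

    soundsink = SoundSink()
    soundsink.activate()

    sound_sources = []
    for row in range(height):
        sound_sources.append([])
        for column in range(width):
            pixel_sources = {}
            for color in ['black', 'background']:
                source = SoundSource(position=[0, 0, 0])
                source.looping = True
                # placeholder file name; one WAV per color in the real code
                source.queue(load_wav_file(color + '.wav'))
                source.gain = 0.0  # everything starts muted
                soundsink.play(source)
                pixel_sources[color] = source
            sound_sources[row].append(pixel_sources)
    is_setup = True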
This is the original code that I made my changes to; sound localization actually works with it:
def sound_generator_algorithm(retinal_encoded_image_cv2_format):
    retinal_encoded_image_width = len(retinal_encoded_image_cv2_format[0])
    retinal_encoded_image_height = len(retinal_encoded_image_cv2_format)

    # NOTE: PyAL uses the RHS coordinate system.
    # Hence, the horizontal extent of the monitor represents the x-axis,
    # with right being positive. The vertical extent of the monitor represents
    # the y-axis, with up being positive; and from one's eyes going into the
    # monitor represents the positive z-axis.

    if not is_setup:
        setup(retinal_encoded_image_cv2_format)

    for row in xrange(retinal_encoded_image_height):
        for column in xrange(retinal_encoded_image_width):
            # Obtaining depth value
            depth = retinal_encoded_image_cv2_format[row][column]
            # Obtaining color value
            color_value = color_image_cv2_format[row][column]
            color_key = None

            # Muting all other colored sounds (except current)
            for color in sound_sources[row][column]:
                if (colors[color] != color_value).any():
                    sound_sources[row][column][color].gain = 0.0
                else:
                    color_key = color

            if np.isnan(depth) or (depth == 0.0) or (depth >= depth_camera_max_depth) or color_key == 'background':
                sound_sources[row][column][color_key].gain = 0.0
            else:
                if color_key == 'background':
                    sound_sources[row][column][color_key].gain = gain_scaled / 2
                else:
                    sound_sources[row][column][color_key].gain = gain_scaled * 2

                # If the sound isn't muted, also update its pitch
                # dependent on its y value (i.e. row).
                # NOTE: Setting the pitch stretches or compresses the sound
                #       by the given value. For example, if the pitch of a
                #       440Hz tone is set to 2.0, the tone played would be
                #       2 * 440Hz = 880Hz
                if row == 0:
                    sound_sources[row][column][color_key].pitch = 1.7
                elif row == 1:
                    sound_sources[row][column][color_key].pitch = 1.3
                elif row == 2:
                    sound_sources[row][column][color_key].pitch = 1.0
                elif row == 3:
                    sound_sources[row][column][color_key].pitch = 0.7
                elif row == 4:
                    sound_sources[row][column][color_key].pitch = 0.3

                projected_min_depth = ssf_core.projected_pixel(unit_vector_map,
                                                               column,
                                                               row,
                                                               depth_camera_min_depth)[2]
                x_scale = 4  # 2.5
                y_scale = 1.0
                z_scale = 1.3
                # scales anything beyond the projected_min_depth,
                # scaling it along the ray
                z_power_scale = 2.0
                # NOTE: only the depth is scaled, and then x, y and z are projected
                #       according to that depth.
                depth = (projected_min_depth * z_scale) + (
                    ((depth - projected_min_depth) * z_scale) ** (z_power_scale * 1.0))
                projected_pixel = ssf_core.projected_pixel(unit_vector_map,
                                                           column,
                                                           row,
                                                           depth)
                # Update the sound source's position based on the projected pixel
                sound_sources[row][column][color_key].position = [projected_pixel[0] * x_scale,
                                                                  projected_pixel[1],
                                                                  -projected_pixel[2]]
    soundsink.update()
I have attempted redoing it with minimal changes apart from the human detection, to see if I accidentally altered something I was not supposed to, but I still get the same result: the sound only ever comes from the centre.
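To rule out the audio side, a stand-alone check of PyAL spatialization would look roughly like the snippet below (beep.wav is a placeholder for whatever sample is used; note that OpenAL only spatializes mono buffers, so a stereo file would always play centred):

import time
from openal.audio import SoundSink, SoundSource
from openal.loaders import load_wav_file

soundsink = SoundSink()
soundsink.activate()

source = SoundSource(position=[0, 0, 0])
source.looping = True
source.queue(load_wav_file("beep.wav"))  # placeholder file; must be mono to be spatialized
soundsink.play(source)

# Sweep the source from left to right; the panning should be audible
for x in range(-5, 6):
    source.position = [x, 0, -2]
    soundsink.update()
    time.sleep(0.5)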