Fitting a line to 3D x, y, z scatter plot data

Updated: 2024-04-02 12:54:13

Question

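I have several data points that cluster along a line in 3D space. The x, y, z data are in a CSV file that I want to import. I'd like to find an equation that represents this line, or the plane perpendicular to it, or whatever is mathematically correct. The data points are independent of each other. There may be a better way to do this than what I'm attempting, but...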

I tried to replicate an old post that seems to do exactly what I want: Fitting a line in 3D

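But it seems that library updates over the past decade may have stopped the second part of that code from running? Or maybe I'm just doing something wrong. I've put the whole thing I cobbled together from that post at the bottom. Two lines seem to be giving me trouble.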

I've isolated them here...

import numpy as np

pts = np.add.accumulate(np.random.random((10,3)))
x,y,z = pts.T

# this will find the slope and x-intercept of a plane
# parallel to the y-axis that best fits the data
A_xz = np.vstack((x, np.ones(len(x)))).T
m_xz, c_xz = np.linalg.lstsq(A_xz, z)[0]

# again for a plane parallel to the x-axis
A_yz = np.vstack((y, np.ones(len(y)))).T
m_yz, c_yz = np.linalg.lstsq(A_yz, z)[0]

# the intersection of those two planes and
# the function for the line would be:
# z = m_yz * y + c_yz
# z = m_xz * x + c_xz
# or:
def lin(z):
    x = (z - c_xz)/m_xz
    y = (z - c_yz)/m_yz
    return x,y

#verifying:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()
ax = Axes3D(fig)
zz = np.linspace(0,5)
xx,yy = lin(zz)
ax.scatter(x, y, z)
ax.plot(xx,yy,zz)
plt.savefig('test.png')
plt.show()

They return the following, but no values...

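FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  m_xz, c_xz = np.linalg.lstsq(A_xz, z)[0]
FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  m_yz, c_yz = np.linalg.lstsq(A_yz, z)[0]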
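For reference, the message above is a warning rather than an error: the two `np.linalg.lstsq` calls still return their results. A minimal sketch of the same two fits with `rcond=None` passed explicitly, as the warning itself advises, might look like the following. The seeded random generator is an assumption added here only so the run is repeatable; it stands in for the real CSV data.

```python
import numpy as np

# Assumption: a seeded generator replaces np.random.random so that
# the demo data are reproducible; real data would come from the CSV.
rng = np.random.default_rng(0)
pts = np.add.accumulate(rng.random((10, 3)))
x, y, z = pts.T

# Fit z = m_xz * x + c_xz; rcond=None opts in to the new default
# and silences the FutureWarning.
A_xz = np.column_stack((x, np.ones(len(x))))
m_xz, c_xz = np.linalg.lstsq(A_xz, z, rcond=None)[0]

# Fit z = m_yz * y + c_yz in the same way.
A_yz = np.column_stack((y, np.ones(len(y))))
m_yz, c_yz = np.linalg.lstsq(A_yz, z, rcond=None)[0]

print("z =", m_xz, "* x +", c_xz)
print("z =", m_yz, "* y +", c_yz)
```

If the figure still comes out empty on a recent Matplotlib, creating the axes with `fig.add_subplot(projection='3d')`, as the answer further down does, is the usual replacement for `Axes3D(fig)`.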

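I don't know where to go from here. I don't even need the plot, I just need an equation, and I'm not equipped to move forward on my own. If anyone knows a simpler way to do this, or can point me in the right direction, I'm willing to learn, but I'm very, very lost. Thanks in advance!!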

Here is my complete Frankenstein code, in case that is what's causing the problem.

import pandas as pd
import numpy as np
mydataset = pd.read_csv('line1.csv')

x = mydataset.iloc[:,0]
y = mydataset.iloc[:,1]
z = mydataset.iloc[:,2]


data = np.concatenate((x[:, np.newaxis], 
                       y[:, np.newaxis], 
                       z[:, np.newaxis]), 
                      axis=1)


# Calculate the mean of the points, i.e. the 'center' of the cloud
datamean = data.mean(axis=0)

# Do an SVD on the mean-centered data.
uu, dd, vv = np.linalg.svd(data - datamean)

# Now vv[0] contains the first principal component, i.e. the direction
# vector of the 'best fit' line in the least squares sense.

# Now generate some points along this best fit line, for plotting.

# we want it to have mean 0 (like the points we did
# the svd on). Also, it's a straight line, so we only need 2 points.
linepts = vv[0] * np.mgrid[-100:100:2j][:, np.newaxis]

# shift by the mean to get the line in the right place
linepts += datamean

# Verify that everything looks right.

import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d as m3d

ax = m3d.Axes3D(plt.figure())
ax.scatter3D(*data.T)
ax.plot3D(*linepts.T)
plt.show()

# this will find the slope and x-intercept of a plane
# parallel to the y-axis that best fits the data
A_xz = np.vstack((x, np.ones(len(x)))).T
m_xz, c_xz = np.linalg.lstsq(A_xz, z)[0]

# again for a plane parallel to the x-axis
A_yz = np.vstack((y, np.ones(len(y)))).T
m_yz, c_yz = np.linalg.lstsq(A_yz, z)[0]

# the intersection of those two planes and
# the function for the line would be:
# z = m_yz * y + c_yz
# z = m_xz * x + c_xz
# or:
def lin(z):
    x = (z - c_xz)/m_xz
    y = (z - c_yz)/m_yz
    return x,y

print(x,y)

#verifying:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()
ax = Axes3D(fig)
zz = np.linspace(0,5)
xx,yy = lin(zz)
ax.scatter(x, y, z)
ax.plot(xx,yy,zz)
plt.savefig('test.png')
plt.show()

Answer

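As suggested in the old post you refer to, you can also use principal component analysis instead of a least-squares approach. For that, I suggest sklearn.decomposition.PCA from the sklearn package.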

An example using the CSV file you provided can be found below.

import pandas as pd
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

mydataset = pd.read_csv('line1.csv')

x = mydataset.iloc[:,0]
y = mydataset.iloc[:,1]
z = mydataset.iloc[:,2]

coords = np.array((x, y, z)).T

pca = PCA(n_components=1)
pca.fit(coords)
direction_vector = pca.components_
print(direction_vector)


# Create plot
origin = np.mean(coords, axis=0)
euclidean_distance = np.linalg.norm(coords - origin, axis=1)
extent = np.max(euclidean_distance)

line = np.vstack((origin - direction_vector * extent,
                  origin + direction_vector * extent))

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(coords[:, 0], coords[:, 1], coords[:,2])
ax.plot(line[:, 0], line[:, 1], line[:, 2], 'r')
plt.show()
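Since the question asks for an equation rather than a plot, the fitted direction can be written as a parametric line p(t) = centroid + t * direction, and the plane perpendicular to it through the centroid is the set of points p with direction · (p - centroid) = 0. The sketch below uses plain NumPy SVD on the mean-centered points, which gives the same first principal direction as PCA up to sign. The helper names `fit_line_3d` and `distance_to_line` and the synthetic data are assumptions made for illustration; they are not part of the answer's code or of any library.

```python
import numpy as np

def fit_line_3d(points):
    """Return (centroid, unit_direction) of the best-fit line (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # Rows of vt are the principal directions of the centered cloud;
    # vt[0] is the one with the largest spread.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, vt[0]

def distance_to_line(points, centroid, direction):
    """Perpendicular distance from each point to the line (hypothetical helper)."""
    offsets = np.asarray(points, dtype=float) - centroid
    along = offsets @ direction  # signed position t of each point along the line
    return np.linalg.norm(offsets - np.outer(along, direction), axis=1)

# Assumption: synthetic noisy points along a known line stand in for line1.csv.
rng = np.random.default_rng(0)
t = rng.uniform(-5, 5, size=50)
true_direction = np.array([1.0, 2.0, 3.0]) / np.sqrt(14.0)
demo_points = np.array([1.0, -1.0, 0.5]) + np.outer(t, true_direction)
demo_points += rng.normal(scale=0.01, size=demo_points.shape)

centroid, direction = fit_line_3d(demo_points)
print("line:  p(t) =", centroid, "+ t *", direction)
print("plane:", direction, ". (p -", centroid, ") = 0")
print("max distance to line:",
      distance_to_line(demo_points, centroid, direction).max())
```

With the real data, passing `coords` from the example above to `fit_line_3d` should give a direction matching `pca.components_[0]`, possibly with the opposite sign, since a line's direction vector is only defined up to sign.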