Configuring TensorFlow on the NVIDIA Jetson TX1 Developer Kit

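This is an original work. It may not be reposted or used commercially without my permission. I reserve the right of final interpretation of this blog.

You are welcome to follow my blogs: http://blog.csdn.net/hit2015spring and http://www.cnblogs.com/xujianqing/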

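The Jetson TX1 has relatively little memory and storage, and its ARM-based CPU differs from Intel processor architectures, so many guides shared online do not apply to the TX1. Here I follow the tutorials published on JetsonHacks to set up TensorFlow.

References: JetsonHacks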

http://www.jetsonhacks.com/2016/12/30/tensorflow-nvidia-jetson-tx1-development-kit/

http://www.jetsonhacks.com/2017/01/28/install-samsung-ssd-on-nvidia-jetson-tx1/

http://www.jetsonhacks.com/2016/12/21/jetson-tx1-swap-file-and-development-preparation/

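The setup is done in three steps.

Setting up the external SSD

I used a Samsung EVO 250GB SSD with a SATA interface. When you plug it in and boot, the SSD is not usable yet: the TX1 does not recognize it until it has been formatted with a filesystem that Linux supports, ext4. After some configuration the SSD can serve as external storage; the root filesystem is then copied onto it and the system is set to boot from the SSD. Unlike a PC, which boots through the BIOS, the TX1 can be booted in several ways, and all it takes is editing extlinux.conf. Follow the video closely and the setup will work.

Power off, plug in the SSD, power on.
Use the GUI: search for Disks, as shown in the figure.
Format the drive and create a new partition.

Create:

For Name, enter jetsonssd-256.

Click OK to finish.

When creating the partition you have to enter its size. Enter 250G; the full capacity of this Samsung SSD cannot be allocated to the partition.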

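Configuring the swap file

This step sets up swap space. The TX1 has only 4GB of memory, which is not enough to build TensorFlow, so swap space has to be added first. When physical memory runs short, the kernel temporarily moves some memory contents out to swap so that physical memory is available to the programs that need it. In addition, if your machine supports power management, that is, if your Linux system can hibernate, the state of the running programs is saved to swap and used to restore the machine when it wakes up.

Commands: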

$ git clone https://github.com/jetsonhacks/postFlashTX1.git
$ cd postFlashTX1
$ ls
$ sudo ./createSwapfile.sh -d [directory location] -s [size in gigabytes] -a

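[directory location]: the path to your SSD.

[size in gigabytes]: the size of the swap file to create, in gigabytes.

-a: turns the swap file on automatically at boot, by way of an entry in /etc/fstab.

I set 20G; the default is 8G. Swap is usually sized at about twice the physical memory.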
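The twice-the-memory rule of thumb can be turned into a quick calculation. The sketch below is only an illustration: the function name suggest_swap_gb is made up for this example, and it assumes a Linux system that reports MemTotal in /proc/meminfo.

```shell
# Suggest a swap size in GB as twice the physical memory, rounded up.
# Argument: memory size in kB, as reported on the MemTotal line of /proc/meminfo.
# (suggest_swap_gb is a hypothetical helper name, not part of postFlashTX1.)
suggest_swap_gb() {
    mem_kb=$1
    # Round memory up to whole GB (1 GB = 1048576 kB), then double it.
    mem_gb=$(( (mem_kb + 1048575) / 1048576 ))
    echo $(( mem_gb * 2 ))
}

# Read MemTotal from the running system when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "Suggested swap size: $(suggest_swap_gb "$mem_kb") GB"
fi
```

The result is only a starting point for the -s argument; a larger value, such as the 20G used here, is a reasonable choice when the SSD has room to spare.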

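The following passage is a caveat from that article. It does not matter for this setup, but it may help with troubleshooting later.

Setting up a swap file on an SSD or other flash memory may cause wear on the device. Most current flash memory has a limited lifetime number of reads and writes, and a busy swap file can use up a good share of them. Note that the same is true of hard disk drives. Newer SSDs have built-in mechanisms that help distribute "write wear". As always, back up your drive and store the backup. In the video, the swap file is mounted automatically when the machine boots. This is great for development, but afterwards you may want to disable that feature. To do so:

$ sudo gedit /etc/fstab

and comment out the line that performs the "swapon". Make sure to save the file, reboot, and check that swap is off.

Also, you may want to be more hard core about your swap area. You can set aside a "swap partition" and use it instead of a swap file. This approach may be faster because the swap area is laid out contiguously. The route is similar to setting up a swap file, but it is beyond the scope of this article.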
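The two manual steps in that note, commenting out the swap line and checking that swap is off, can be sketched as small shell helpers. This is a rough illustration rather than part of the JetsonHacks scripts: the function names are made up, and it assumes the swap entry is an ordinary fstab line with swap in the third field.

```shell
# Print an fstab-style file with every active swap entry commented out.
# Assumption: a swap entry is a non-comment line whose third field is "swap".
# Nothing is modified; the result goes to standard output for review.
comment_out_swap() {
    awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Count the active swap areas listed in a /proc/swaps-style file.
# The first line is a header; every later non-empty line is one swap area.
count_swap_areas() {
    swaps_file=${1:-/proc/swaps}
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_file"
}

# Report the current state on systems that provide /proc/swaps.
if [ -r /proc/swaps ]; then
    echo "Active swap areas: $(count_swap_areas)"
fi
```

Running comment_out_swap /etc/fstab shows what the edited file would look like without touching it. After a reboot with the entry disabled, the count of active swap areas should be 0.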

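Once the script has finished, continue the setup in Disks.

Select the options as shown in the figure.

Then reboot the system.

Booting the system from the SSD:

The on-board eMMC storage is only 16GB, and only 4.4GB remains after the system is installed in the first step. Building the TensorFlow environment needs a lot of storage, so the storage has to be expanded. Simply adding an SSD and putting files on it does not work well; the system itself needs to run from the SSD. Following the second half of the video tutorial, copy the system files to the SSD and modify the boot configuration file so that the system boots from the SSD: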

$ sudo cp -ax / '/media/ubuntu/jetsonssd'
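If the SSD is not actually mounted at that path, the copy lands in an ordinary directory on the eMMC and can fill it up. A check along the following lines, run before the copy, guards against that. It is a sketch under assumptions: is_mount_point is a made-up helper name, and the mount path is simply the one used in the cp command.

```shell
# Succeed if the given directory is listed as a mount point in a
# /proc/mounts-style file (the second field of each line is the mount point).
# (is_mount_point is a hypothetical helper written for this example.)
is_mount_point() {
    target=$1
    mounts_file=${2:-/proc/mounts}
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

# Mount path taken from the cp command; adjust it if your SSD mounts elsewhere.
SSD_MOUNT='/media/ubuntu/jetsonssd'
if [ -r /proc/mounts ] && is_mount_point "$SSD_MOUNT"; then
    echo "$SSD_MOUNT is mounted; the copy will land on the SSD."
else
    echo "$SSD_MOUNT is not a mount point; mount the SSD before copying." >&2
fi
```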

Modify the configuration file

The last setup step is to modify the extlinux.conf file on the eMMC. The system will boot from the internal eMMC, then the kernel will set the root directory to point to the SATA drive.

$ cd /boot/extlinux
$ sudo cp extlinux.conf extlinux.conf.original
$ sudo gedit /boot/extlinux/extlinux.conf

Only a few places need to be changed.

The complete file looks like this:

TIMEOUT 30
DEFAULT satassd

MENU TITLE p2371-2180 eMMC boot options

LABEL satassd
MENU LABEL primary SATA SSD
LINUX /boot/Image
INITRD /boot/initrd
FDT /boot/tegra210-jetson-tx1-p2597-2180-a01-devkit.dtb
APPEND fbcon=map:0 console=tty0 console=ttyS0,115200n8 androidboot.modem=none androidboot.serialno=P2180A00P00940c003fd androidboot.security=non-secure tegraid=21.1.2.0.0 ddr_die=2048M@2048M ddr_die=2048M@4096M section=256M memtype=0 vpr_resize usb_port_owner_info=0 lane_owner_info=0 emc_max_dvfs=0 touch_id=0@63 video=tegrafb no_console_suspend=1 debug_uartport=lsport,0 earlyprintk=uart8250-32bit,0x70006000 maxcpus=4 usbcore.old_scheme_first=1 lp0_vec=${lp0_vec} nvdumper_reserved=${nvdumper_reserved} core_edp_mv=1125 core_edp_ma=4000 gpt android.kerneltype=normal androidboot.touch_vendor_id=0 androidboot.touch_panel_id=63 androidboot.touch_feature=0 androidboot.bootreason=pmc:software_reset,pmic:0x0 net.ifnames=0 root=/dev/sda1 rw rootwait

LABEL emmc
MENU LABEL Internal eMMC
LINUX /boot/Image
INITRD /boot/initrd
FDT /boot/tegra210-jetson-tx1-p2597-2180-a01-devkit.dtb
APPEND fbcon=map:0 console=tty0 console=ttyS0,115200n8 androidboot.modem=none androidboot.serialno=P2180A00P00940c003fd androidboot.security=non-secure tegraid=21.1.2.0.0 ddr_die=2048M@2048M ddr_die=2048M@4096M section=256M memtype=0 vpr_resize usb_port_owner_info=0 lane_owner_info=0 emc_max_dvfs=0 touch_id=0@63 video=tegrafb no_console_suspend=1 debug_uartport=lsport,0 earlyprintk=uart8250-32bit,0x70006000 maxcpus=4 usbcore.old_scheme_first=1 lp0_vec=${lp0_vec} nvdumper_reserved=${nvdumper_reserved} core_edp_mv=1125 core_edp_ma=4000 gpt android.kerneltype=normal androidboot.touch_vendor_id=0 androidboot.touch_panel_id=63 androidboot.touch_feature=0 androidboot.bootreason=pmc:software_reset,pmic:0x0 net.ifnames=0 root=/dev/mmcblk0p1 rw rootwait

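The changed parts were marked in red in the original post. Compared with the emmc entry, the satassd entry differs in its label and in root=/dev/sda1, and the DEFAULT line selects it.

Reboot, and it works.

To boot from the eMMC again, just change this configuration: edit the label named on the DEFAULT line (to emmc).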
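Before rebooting, it can help to confirm which entry the file selects and which root device that entry uses. The helpers below are a minimal sketch for that check. Their names are invented for this example, and they assume the file follows the layout shown above, with DEFAULT, LABEL and APPEND each starting a line.

```shell
# Print the label named on the DEFAULT line of an extlinux.conf-style file.
# (default_label and root_for_label are hypothetical helper names.)
default_label() {
    awk '$1 == "DEFAULT" { print $2; exit }' "$1"
}

# Print the root= device of the entry with the given label.
# Usage: root_for_label FILE LABEL
root_for_label() {
    awk -v want="$2" '
        $1 == "LABEL" { current = $2 }
        $1 == "APPEND" && current == want {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^root=/) { sub(/^root=/, "", $i); print $i; exit }
        }' "$1"
}

# Report on the real file when it is present and readable.
CONF=/boot/extlinux/extlinux.conf
if [ -r "$CONF" ]; then
    label=$(default_label "$CONF")
    echo "Default entry: $label, root device: $(root_for_label "$CONF" "$label")"
fi
```

An empty root device in the output suggests that the root= argument is missing or has run together with the argument before it on the APPEND line.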

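Configuring TensorFlow

Configuring TensorFlow turned out to be a real pitfall. Thanks to the Great Firewall, my attempts to get around it failed, the dependency packages downloaded incompletely, the setup did not succeed, and I could not find the root of the problem.

So I started reading the setup scripts on GitHub and tracked down the problems one by one. Here are those scripts, with explanations.

The whole setup is really just a sequence of commands.

Clone the JetsonHacks install scripts from GitHub: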

git clone https://github.com/jetsonhacks/installTensorFlowTX1
cd installTensorFlowTX1

Set the dynamic library path

./setLocalLib.sh

Download and install the prerequisites, including Java, Protobuf and others

./installPrerequisites.sh

Download the TensorFlow source

./cloneTensorFlow.sh
./setTensorFlowEV.sh

The script asks two questions; answer n to the first and y to the second.

./buildTensorFlow.sh
./packageTensorFlow.sh

This step needs superuser privileges

sudo pip install $HOME/tensorflow-0.11.0-py2-none-any.whl
cd $HOME/tensorflow
time python tensorflow/models/image/mnist/convolutional.py
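Because a failure in an early step (an incomplete download, for example) makes every later step fail in confusing ways, it may be worth running the scripts through a small wrapper that stops at the first error. The wrapper below is a sketch, not something the repository provides; run_in_order is a made-up name.

```shell
# Run each given command in order and stop at the first one that fails.
# Each argument is a single command or script path without arguments.
# (run_in_order is a hypothetical helper, not part of installTensorFlowTX1.)
run_in_order() {
    for step in "$@"; do
        echo "==> $step"
        if ! "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}
```

From inside the installTensorFlowTX1 directory it could be used as run_in_order ./setLocalLib.sh ./installPrerequisites.sh ./cloneTensorFlow.sh ./setTensorFlowEV.sh ./buildTensorFlow.sh ./packageTensorFlow.sh, keeping in mind that setTensorFlowEV.sh still stops to ask its questions.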

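This is the documentation from the GitHub repository:

At least 8GB of swap is needed, and at least 5.5GB of storage should be free, which is why it tells you to add a drive. The library path also has to be set up, which is done with the setLocalLib.sh script.

Two versions of Protobuf need to be compiled here: v3.1.0 for grpc-java and v3.0.0-beta-2 for Bazel. The first is installed in $HOME/lib and $HOME/bin.

grpc-java v0.15.0, with a patch that supports the ARM architecture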

grpc-java v0.15.0 requires > v3.0.0-beta-3 of protobuf. A patch is applied for aarch64.

Bazel

Builds version 0.3.2. Includes patches for compiling under aarch64.

The patches in this version support the arm64 architecture.

Before installing TensorFlow, a swap file should be created (minimum of 8GB recommended). The Jetson TX1 does not have enough physical memory to compile TensorFlow. Also, if TensorFlow is being compiled on the built-in 16GB flash drive, a standard JetPack installation may consume too much room on the drive to successfully build TensorFlow. Extraneous files will need to be removed. Eliminating the .deb files in the home directory appears to be enough to allow TensorFlow to build. Successful builds tend to have more than 5.5GB free. Also, for a successful build it is recommended to set local lib using the included script setLocalLib.sh, as grpc-java in particular seems to run into issues if /usr/local/lib is not in the path.
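The free-space figure can be checked before starting a build that takes hours. The following is a small preflight sketch under assumptions: the 5.5GB threshold comes from the paragraph above, while the helper names and the choice to measure the filesystem that holds $HOME are mine.

```shell
# Threshold from the README: successful builds tend to have more than 5.5 GB free.
MIN_FREE_KB=5767168   # 5.5 * 1024 * 1024 kB

# Succeed if the given number of free kB exceeds the threshold.
# (has_enough_space and free_kb are hypothetical helper names.)
has_enough_space() {
    [ "$1" -gt "$MIN_FREE_KB" ]
}

# Print the available space, in kB, on the filesystem holding the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

avail=$(free_kb "${HOME:-/}")
if has_enough_space "$avail"; then
    echo "Free space looks sufficient: $avail kB available."
else
    echo "Less than 5.5 GB free ($avail kB); clean up before building." >&2
fi
```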

Note: Most of this procedure was derived from the thread: https://github.com/tensorflow/tensorflow/issues/851

TensorFlow should be built in the following order:

installPrerequisites.sh

Installs Java and other dependencies needed. Also builds:

Protobuf

Two versions of protobuf are compiled. The first (v3.1.0) is needed to build grpc-java. This version ends up being installed in $HOME/lib and $HOME/bin. The second version (v3.0.0-beta-2) is used to build bazel.

grpc-java

grpc-java v0.15.0 requires > v3.0.0-beta-3 of protobuf. A patch is applied for aarch64.

Bazel

Builds version 0.3.2. Includes patches for compiling under aarch64.

cloneTensorFlow.sh

Git clones r0.11 from the TensorFlow repository and patches the source code for aarch64

setTensorFlowEV.sh

Sets up the TensorFlow environment variables. This script will ask for the default python library path.

buildTensorFlow.sh

Builds TensorFlow.

packageTensorFlow.sh

Once TensorFlow has finished building, this script may be used to create a 'wheel' file, a package for installing with Python. The wheel file will be in the $HOME directory, tensorflow-0.11.0-py2-none-any.whl

Install wheel file

$ pip install $HOME/tensorflow-0.11.0-py2-none-any.whl

Test

Run a simple TensorFlow example for the initial sanity check:

$ cd $HOME/tensorflow

$ time python tensorflow/models/image/mnist/convolutional.py

Build Issues

For various reasons, the build may fail. The 'debug' folder contains a version of the buildTensorFlow.sh script which is more verbose in the way that it describes both what it is doing and errors it encounters. See the debug directory for more details.

Notes

As of this writing (Jan 15, 2017) the TensorFlow repository has an issue which does not allow incremental compilation to work correctly. This is due to an issue in the file:

tensorflow/third_party/gpus/cuda_configure.bzl

Where the rule:

cuda_configure = repository_rule(
    implementation = _cuda_autoconf_impl,
    local = True,
)

forces Bazel to always rebuild the CUDA configuration, which in turn foobars the incremental build process. The cloneTensorFlow.sh script patches the file to remove the local = True statement. Additionally, buildTensorFlow.sh sets TensorFlow environment variables to reflect the CUDA structure of the Jetson TX1.

Since v0.11 was published, the location of the zlib library being used has moved. This is also taken into account by the cloneTensorFlow.sh script, which patches the library location.

Problems encountered

Unzipping /home/ubuntu/.gradle/wrapper/dists/gradle-2.13-bin/4xsgxlfjcxvrea7akf941nvc7/gradle-2.13-bin.zip to /home/ubuntu/.gradle/wrapper/dists/gradle-2.13-bin/4xsgxlfjcxvrea7akf941nvc7
Exception in thread "main" java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:219)
    at java.util.zip.ZipFile.<init>(ZipFile.java:149)
    at java.util.zip.ZipFile.<init>(ZipFile.java:163)
    at org.gradle.wrapper.Install.unzip(Install.java:214)
    at org.gradle.wrapper.Install.access$600(Install.java:27)
    at org.gradle.wrapper.Install$1.call(Install.java:74)
    at org.gradle.wrapper.Install$1.call(Install.java:48)
    at org.gradle.wrapper.ExclusiveFileAccessManager.access(ExclusiveFileAccessManager.java:65)
    at org.gradle.wrapper.Install.createDist(Install.java:48)
    at org.gradle.wrapper.WrapperExecutor.execute(WrapperExecutor.java:128)
    at org.gradle.wrapper.GradleWrapperMain.main(GradleWrapperMain.java:61)

Fix: go straight to

/home/ubuntu/.gradle/wrapper/dists/gradle-2.13-bin/4xsgxlfjcxvrea7akf941nvc7

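and delete the zip archive there. Then download gradle-2.13-bin.zip yourself from the official site (the network really is unreliable) and, of course, put it in that same directory.

After that it works!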
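Since the underlying cause was a truncated download, it is worth testing the archive before handing it back to Gradle. The sketch below uses unzip's test mode for that. It assumes unzip is installed; zip_ok is an invented helper name, and the path is the one from the error message above.

```shell
# Succeed if the archive passes unzip's integrity test (-t), quietly (-q).
# (zip_ok is a hypothetical helper; it requires the unzip tool.)
zip_ok() {
    unzip -tq "$1" > /dev/null 2>&1
}

# Path copied from the error message above.
GRADLE_ZIP=/home/ubuntu/.gradle/wrapper/dists/gradle-2.13-bin/4xsgxlfjcxvrea7akf941nvc7/gradle-2.13-bin.zip
if [ -e "$GRADLE_ZIP" ]; then
    if zip_ok "$GRADLE_ZIP"; then
        echo "Archive is intact."
    else
        echo "Archive is corrupt or incomplete; download it again." >&2
    fi
fi
```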

Another problem:

git reported a download failure: RPC failed; curl 56 GnuTLS recv error (-9)

The crude but effective fix: uninstall git and install it again.
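Errors like this one often come from a flaky connection, so a different approach from the reinstall above is simply to retry the download a few times. The wrapper below is a generic sketch of that idea and is not what I did at the time. The retry function and the RETRY_DELAY variable are names made up for this example.

```shell
# Run a command up to N times, pausing between attempts, until it succeeds.
# Usage: retry N command [args...]
# RETRY_DELAY (seconds between attempts) is a hypothetical knob; default 5.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        n=$((n + 1))
        sleep "${RETRY_DELAY:-5}"
    done
    return 1
}
```

For instance, retry 3 git clone https://github.com/jetsonhacks/installTensorFlowTX1 would try the clone up to three times. A partial directory left behind by a failed clone has to be removed before the next attempt can succeed.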

Finally, a nice bonus: https://github.com/rwightman/tensorflow/commit/a1cde1d55f76a1d4eb806ba81d7c63fe72466e6d has something good, a one-step install.