Extracting Videos from the PPStream Cache File

By CppLive | Posted on 2011-07-30

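Any video you have watched in full on PPStream is cached locally, in a cache file 1 GB in size. On Windows the cache file is named ppsds.pgf, and you can find its location by opening PPStream and going to “工具 -> 选项 -> 点播服务 -> 缓存文件管理” (Tools -> Options -> On-demand Service -> Cache File Management). On Ubuntu the cache file is named ems.cache and lives in “~/.pps/datacache”; note that ‘~’ stands for your home directory and “.pps” is a hidden folder. Any video you watched recently can be extracted from the cache file as a normally playable video. The sections below cover extraction on Windows and on Ubuntu.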
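Before attempting an extraction, it can help to confirm that the cache file is where you expect and has the expected size. The following is a minimal sketch, not part of the original script: `describe_cache` and `ONE_GIB` are hypothetical names, and it assumes that "1 GB" means exactly 1024**3 bytes, which is what the `MAXCACHEIDX` constant in the script suggests.

```python
import os

ONE_GIB = 1024 ** 3  # assumed exact cache size (matches MAXCACHEIDX in the script)


def describe_cache(path):
    """Return (exists, size_in_bytes, size_is_one_gib) for a cache path."""
    path = os.path.expanduser(path)  # resolves the leading '~'
    if not os.path.isfile(path):
        return (False, 0, False)
    size = os.path.getsize(path)
    return (True, size, size == ONE_GIB)


if __name__ == "__main__":
    # Ubuntu location from the article; on Windows, pass the ppsds.pgf path instead.
    print(describe_cache("~/.pps/datacache/ems.cache"))
```

If the first element is False, the file was not found at that path; if the last is False, the file may be incomplete or use a layout different from the one the script assumes.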

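On Windows, you can download an archive from http://download.csdn.net/source/1292410 that explains the extraction method in detail. It contains both a ready-made executable and the source code for reference, so I will not go into detail here.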

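The Linux approach follows the Windows implementation, but it is written in Python rather than C. Because Python is highly portable across platforms, if you find that the C extraction program no longer works on Windows, you may want to try the Python code below as well.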

import logging
import os
import mmap

from struct import unpack_from
from urllib.parse import unquote
from itertools import takewhile

logging.basicConfig(level=logging.INFO,
		format='%(levelname)s %(lineno)d | %(message)s',
		datefmt='%H:%M:%S')

MAXCACHEIDX=1024**3	# size of the cache file (1 GiB)
INFOBLOCK=1616		# header bytes in front of every data block
DATABLOCK=1024**2*2	# payload bytes per data block (2 MiB)
BLOCK=INFOBLOCK+DATABLOCK

def _getFileInfo(cacheFile):
	# Scan the index records at the end of the cache; one dict per filename.
	ret={}
	with open(cacheFile,'rb') as f:
		# access=ACCESS_READ works on Unix and Windows (prot= is Unix-only)
		mf=mmap.mmap(f.fileno(),0,access=mmap.ACCESS_READ)
		firstIDX=1073656324
		idx=firstIDX
		cnt=0
		end=min(MAXCACHEIDX,len(mf))

		# every index record is 9500 bytes long
		while idx+9500 <= end:
			# get 20bytes hash string (unpadded hex, same form as in _check)
			hashid=''.join('%X'%x for x in mf[idx+4:idx+4+20])
			# get number of idxpos items
			nr_idxpos=unpack_from('<L',mf,idx+4+20)[0]
			# get idxpos items
			idxpos=unpack_from('<%dL'%nr_idxpos,mf,idx+0x11c)
			# get filename (NUL-terminated, percent-encoded GBK)
			dataBgn=idx+8476
			cnt+=1
			n=''.join(chr(x) for x in takewhile(lambda x:x!=0,mf[dataBgn:dataBgn+96]))
			n=unquote(n,'gbk')
			# get blocks number; read it before the duplicate check that uses it
			blkall=unpack_from('<L',mf,idx+8476+0x104)[0]
			idx+=9500
			if not n:
				# skip records with no filename; there is nothing to write them out as
				continue
			if n in ret:
				# keep the first record; warn instead of aborting on a mismatch
				if ret[n]['hashid']!=hashid or ret[n]['blkall']!=blkall:
					logging.warning('duplicate entry %r differs, keeping the first',n)
				continue
			logging.debug('%d) find filename %s at %d',cnt,n,dataBgn)
			logging.debug('\thash=%s',hashid)
			logging.debug('\tnr_idxpos=%d',nr_idxpos)
			logging.debug('\t all blks should be %d',blkall)
			if idxpos.count(0)==0 and len(idxpos)==blkall:
				logging.debug('\t-=-=- this is a complete file list -=-=-')
			ret[n]={'filename':n,'blks':None,'blkall':blkall,'hashid':hashid}
		mf.close()
	return ret

def _check(cacheFile,info,firstBlock=1836):
	# Walk the data blocks and record, per hash, the offset of each block number.
	tmpblks={}
	for k,v in info.items():
		tmpblks[v['hashid']]=[0 for _ in range(v['blkall'])]
	with open(cacheFile,'rb') as f:
		mf=mmap.mmap(f.fileno(),0,access=mmap.ACCESS_READ)
		idx=firstBlock
		while 0< idx < 1068274747 and idx+BLOCK <= len(mf):
			# block number: 16-bit little-endian value at offset 60
			blkno=unpack_from('<H',mf,idx+60)[0]
			assert blkno<509
			tmpchk=''.join('%X'%x for x in mf[idx+40:idx+60])
			try:
				tmpblks[tmpchk][blkno]=idx
			except (KeyError,IndexError):
				logging.debug('%s at %d not in !',tmpchk,idx)
			idx+=BLOCK
		mf.close()
	for k,v in info.items():
		if v['hashid'] in tmpblks:
			v['blks']=tmpblks[v['hashid']]
		else:
			logging.debug('%s %s have no data!',k,v['hashid'])
			v['blks']=[]
	return info

def _extractFile(cacheFile,info,outDir):
	# Concatenate the payload of every block that was found, in block order.
	ofilename=os.path.join(outDir,os.path.basename(info['filename']))
	with open(cacheFile,'rb') as f:
		mf=mmap.mmap(f.fileno(),0,access=mmap.ACCESS_READ)
		logging.info('writing to file %s ...',ofilename)
		with open(ofilename,'wb') as fo:
			cnt=0
			for blkidx in info['blks']:
				# offset 0 marks a block that is missing from the cache
				if blkidx!=0:
					fo.write(mf[blkidx+INFOBLOCK:blkidx+INFOBLOCK+DATABLOCK])
					cnt+=1
		logging.info('file %s written. total %d blocks, %d bytes',ofilename,cnt,cnt*DATABLOCK)
		mf.close()

def extractFileFromPPStreamCache(cacheFile):
	info=_getFileInfo(cacheFile)
	info=_check(cacheFile,info)
	cnt=0
	fs=[]
	for k,v in info.items():
		cnt+=1
		logging.info('%d) %s   %d/%d',cnt,k,len([ x for x in v['blks'] if x!=0]),v['blkall'])
		fs.append(k)
	if len(info)>0:
		try:
			x=int(input('choose file to extract (1-%d):'%(len(info),)))
		except ValueError:
			logging.info('input invalid!')
		else:
			if 0< x <=len(info):
				_extractFile(cacheFile,info[fs[x-1]],'/tmp/')

if __name__=='__main__':
	# Python 3 strings are already Unicode; sys.setdefaultencoding() does not exist there
	extractFileFromPPStreamCache(os.path.expanduser('~/.pps/datacache/ems.cache'))
	logging.debug('done')
	input('press ENTER to exit ...')

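Save the Python code above to a file named PPStream.py. Python is very strict about formatting, so make sure the copy you paste keeps the alignment and indentation shown above. If the Python interpreter reports a formatting error, try re-indenting the code with TAB characters, following the original layout. If you want to try this code on Windows, change the path in the third line from the end to the location of your ppsds.pgf cache file on Windows.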
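One way to check a pasted copy for indentation damage before running it is to ask Python to parse the text without executing it. The sketch below is only an illustration: `check_script` is a hypothetical helper, not something the original post provides, and it relies on the standard `ast.parse` function.

```python
import ast


def check_script(source, filename="PPStream.py"):
    """Return None if the source parses, else a short error description."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as err:  # IndentationError and TabError are subclasses
        return "%s at line %s: %s" % (type(err).__name__, err.lineno, err.msg)
    return None


if __name__ == "__main__":
    # A stray leading space on the first line, as can happen when copying from a web page.
    print(check_script(" import re\n"))
    # A correctly indented snippet parses cleanly and yields None.
    print(check_script("def f():\n\treturn 1\n"))
```

To check a real file, read its contents into a string and pass that string to `check_script`. The reported line number points at the first line Python could not parse.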

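Having the PPStream.py file alone is not enough: without a Python interpreter the code cannot run. So if your system has no Python interpreter, you need to install Python, and it must be version 3.0 or later. Here we install python3 on Ubuntu with the following command: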

sudo apt-get install python3

Once it is installed, cd to the directory containing PPStream.py and run the following command:

python3 PPStream.py

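The terminal lists the videos found in the cache file and prompts you to choose one to extract. After you enter your choice as prompted, the successfully extracted video is saved in the “/tmp” directory.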
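The script hard-codes both the cache path and the “/tmp” output directory, so running it elsewhere means editing the source. As a hedged alternative, the sketch below reads both from the command line. `resolve_paths` and `DEFAULT_CACHE` are hypothetical names that do not appear in the script, and falling back to `tempfile.gettempdir()` is an assumption about what a sensible default would be on systems without “/tmp”.

```python
import os
import sys
import tempfile

DEFAULT_CACHE = "~/.pps/datacache/ems.cache"  # Ubuntu path from the article


def resolve_paths(argv):
    """Pick the cache file and output directory from command-line arguments.

    argv[1], if present, overrides the cache path (e.g. a ppsds.pgf on Windows);
    argv[2], if present, overrides the output directory.
    """
    cache = os.path.expanduser(argv[1] if len(argv) > 1 else DEFAULT_CACHE)
    out_dir = argv[2] if len(argv) > 2 else tempfile.gettempdir()
    return cache, out_dir


if __name__ == "__main__":
    cache, out_dir = resolve_paths(sys.argv)
    print("cache file:", cache)
    print("output dir:", out_dir)
```

The two returned values could then be handed to the script's extraction functions in place of the literal paths.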

If the Python code you copied still does not work, leave a comment below and I will send you the PPStream.py file.
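For readers who want to see what the script assumes about the cache layout, here is a small sketch built only from the constants and offsets that appear in the script: data blocks start at offset 1836, each block is a 1616-byte header followed by 2 MiB of payload, the 20-byte hash sits at header offset 40, and the block number follows at offset 60. The helper names and the synthetic header are illustrative and not real PPStream data. The block number is read here as one 16-bit little-endian value, which is equivalent to the low byte plus the high byte times 0x100 as computed in the loop of `_check`.

```python
from struct import pack, unpack_from

INFOBLOCK = 1616           # header bytes per block (from the script)
DATABLOCK = 1024 ** 2 * 2  # payload bytes per block (from the script)
BLOCK = INFOBLOCK + DATABLOCK
FIRST_BLOCK = 1836         # default firstBlock argument of _check


def block_offset(slot):
    """File offset of the slot-th block, counting from 0."""
    return FIRST_BLOCK + slot * BLOCK


def read_block_header(buf, offset):
    """Return (hash_hex, block_number) using the offsets _check relies on."""
    hash_hex = ''.join('%X' % b for b in buf[offset + 40:offset + 60])
    (blkno,) = unpack_from('<H', buf, offset + 60)
    return hash_hex, blkno


# Synthetic header for illustration only.
header = bytearray(INFOBLOCK)
header[40:60] = bytes(range(0xA0, 0xB4))  # 20 made-up hash bytes
header[60:62] = pack('<H', 300)           # made-up block number
print(block_offset(2))                    # 1836 + 2 * 2098768 = 4199372
print(read_block_header(header, 0))
```

The payload of a block then occupies the bytes from `offset + INFOBLOCK` to `offset + BLOCK`, which is the slice `_extractFile` writes out.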

Unless otherwise noted, all articles are original to CppLive 编程在线. Please credit the source when reposting. Thank you.

Article URL: https://www.cpplive.com/html/773.html

jtliaw says:

November 21, 2012 at 11:43 pm

Could you please send me a copy of PPStream.py?
Also, where should PPStream.py be placed?

jtliaw says:

November 21, 2012 at 11:51 pm

This is what I get as soon as I run it:
dibing@dibing-Aspire-4935:~/.pps/datacache$ python3 PPStream.py
File "PPStream.py", line 1
1. import re
^
IndentationError: unexpected indent

CppLive says:

November 27, 2012 at 2:29 pm

Sure, the email has been sent. Please check the attachment.


sarrow104 says:

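March 1, 2012 at 9:05 am

I just downloaded it and gave it a try.

Likewise, I commented out line 124.

But once the program started running, it just hung there.

Is there any documentation available as a PDF file?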

CppLive says:

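March 2, 2012 at 10:50 am

Are you using PPStream for Linux? Is the version 0.1.1678? After watching a video, do you see the ems.cache file at the path “~/.pps/datacache/ems.cache”? Also, please be sure to run the script with python3. It could also be a formatting problem with the script. I have emailed you the original script file as an attachment, so please check your inbox.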


YQ君死宅 says:

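August 8, 2011 at 4:28 pm

The author is amazing! I just started a new blog, and seeing such a technical article from you... I feel a lot of pressure > <
I want to learn from you!!

PS: I am using Lubuntu, and it only worked after I deleted sys.setdefaultencoding('utf-8') on line 123. I don't know why. The terminal showed this error:
File "PPStream.py", line 123, in <module>
sys.setdefaultencoding('utf-8')
AttributeError: 'module' object has no attribute 'setdefaultencoding'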

CppLive says:

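August 11, 2011 at 12:25 am

This problem has never shown up for me on Ubuntu, although it does appear if I remove the fifth line from the end. On Windows I hit the same problem as you, and I am not sure why either. Could it be that Lubuntu's default character encoding, like that of Windows, is not utf-8?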

- jtliaw says:
  
  November 22, 2012 at 12:24 am
  
  On my Ubuntu 12.04, after deleting line 123:
  dibing@dibing-Aspire-4935:~/.pps/datacache$ python3 PPStream.py
  Traceback (most recent call last):
  File "PPStream.py", line 123, in <module>
  extractFileFromPPStreamCache(os.path.expanduser('~/.pps/datacache/ems.cache'))
  File "PPStream.py", line 104, in extractFileFromPPStreamCache
  info=_getFileInfo(cacheFile)
  File "PPStream.py", line 43, in _getFileInfo
  assert ret[n]['hashid']==hashid
  AssertionError
  

CppLive says:

July 31, 2011 at 12:55 pm

Haha, thanks for the support 😛
